Commit Graph

25378 Commits

Author SHA1 Message Date
Linus Torvalds
04b905942b Merge branch 'tty-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6
* 'tty-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
  serial: bcm63xx_uart: fix irq storm after rx fifo overrun.
  amba pl011: platform data for reg lockup and glitch v2
  amba pl011: workaround for uart registers lockup
  tty: n_gsm: improper skb_pull() use was leaking framed data
  tty: n_gsm: Fixed logic to decode break signal from modem status
  TTY: ntty, add one more sanity check
  TTY: ldisc, do not close until there are readers
  8250: Fix capabilities when changing the port type
  8250_pci: Fix missing const from merges
  ARM: SAMSUNG: serial: Fix on handling of one clock source for UART
  serial: ioremap warning fix for jsm driver.
  8250_pci: add -ENODEV code for Intel EG20T PCH
2011-06-28 11:14:55 -07:00
Jan Kara
08142579b6 mm: fix assertion mapping->nrpages == 0 in end_writeback()
Under heavy memory and filesystem load, users observe the assertion
mapping->nrpages == 0 in end_writeback() trigger.  This can be caused by
page reclaim reclaiming the last page from a mapping in the following
race:

	CPU0				CPU1
  ...
  shrink_page_list()
    __remove_mapping()
      __delete_from_page_cache()
        radix_tree_delete()
					evict_inode()
					  truncate_inode_pages()
					    truncate_inode_pages_range()
					      pagevec_lookup() - finds nothing
					  end_writeback()
					    mapping->nrpages != 0 -> BUG
        page->mapping = NULL
        mapping->nrpages--

Fix the problem by doing a reliable check of mapping->nrpages under
mapping->tree_lock in end_writeback().

Analyzed by Jay <jinshan.xiong@whamcloud.com>, lost in LKML, and dug out
by Miklos Szeredi <mszeredi@suse.de>.

Cc: Jay <jinshan.xiong@whamcloud.com>
Cc: Miklos Szeredi <mszeredi@suse.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-27 18:00:13 -07:00
Chris Metcalf
507c5f1224 include/linux/compat.h: declare compat_sys_sendmmsg()
This is required for tilegx to be able to use the compat unistd.h header
where compat_sys_sendmmsg() is now mentioned.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-27 18:00:13 -07:00
Hugh Dickins
d9d90e5eb7 tmpfs: add shmem_read_mapping_page_gfp
Although it is used (by i915) on nothing but tmpfs, read_cache_page_gfp()
is unsuited to tmpfs, because it inserts a page into pagecache before
calling the filesystem's ->readpage: tmpfs may have pages in swapcache
which only it knows how to locate and switch to filecache.

At present tmpfs provides a ->readpage method, and copes with this by
copying pages; but soon we can simplify it by removing its ->readpage.
Provide shmem_read_mapping_page_gfp() now, ready for that transition,

Export shmem_read_mapping_page_gfp() and add it to list in shmem_fs.h,
with shmem_read_mapping_page() inline for the common mapping_gfp case.

(shmem_read_mapping_page_gfp or shmem_read_cache_page_gfp? Generally the
read_mapping_page functions use the mapping's ->readpage, and the
read_cache_page functions use the supplied filler, so I think
read_cache_page_gfp was slightly misnamed.)

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-27 18:00:12 -07:00
Hugh Dickins
94c1e62df4 tmpfs: take control of its truncate_range
2.6.35's new truncate convention gave tmpfs the opportunity to control
its file truncation, no longer enforced from outside by vmtruncate().
We shall want to build upon that, to handle pagecache and swap together.

Slightly redefine the ->truncate_range interface: let it now be called
between the unmap_mapping_range()s, with the filesystem responsible for
doing the truncate_inode_pages_range() from it - just as the filesystem
is nowadays responsible for doing that from its ->setattr.

Let's rename shmem_notify_change() to shmem_setattr().  Instead of
calling the generic truncate_setsize(), bring that code in so we can
call shmem_truncate_range() - which will later be updated to perform its
own variant of truncate_inode_pages_range().

Remove the punch_hole unmap_mapping_range() from shmem_truncate_range():
now that the COW's unmap_mapping_range() comes after ->truncate_range,
there is no need to call it a third time.

Export shmem_truncate_range() and add it to the list in shmem_fs.h, so
that i915_gem_object_truncate() can call it explicitly in future; get
this patch in first, then update drm/i915 once this is available (until
then, i915 will just be doing the truncate_inode_pages() twice).

Though introduced five years ago, no other filesystem is implementing
->truncate_range, and its only other user is madvise(,,MADV_REMOVE): we
expect to convert it to fallocate(,FALLOC_FL_PUNCH_HOLE,,) shortly,
whereupon ->truncate_range can be removed from inode_operations -
shmem_truncate_range() will help i915 across that transition too.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-27 18:00:12 -07:00
Hugh Dickins
072441e21d mm: move shmem prototypes to shmem_fs.h
Before adding any more global entry points into shmem.c, gather such
prototypes into shmem_fs.h.  Remove mm's own declarations from swap.h,
but for now leave the ones in mm.h: because shmem_file_setup() and
shmem_zero_setup() are called from various places, and we should not
force other subsystems to update immediately.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-27 18:00:12 -07:00
Vitaliy Ivanov
4d258b25d9 Fix some kernel-doc warnings
Fix 'make htmldocs' warnings:

  Warning(/include/linux/hrtimer.h:153): No description found for parameter 'clockid'
  Warning(/include/linux/device.h:604): Excess struct/union/enum/typedef member 'of_match' description in 'device'
  Warning(/include/net/sock.h:349): Excess struct/union/enum/typedef member 'sk_rmem_alloc' description in 'sock'

Signed-off-by: Vitaliy Ivanov <vitalivanov@gmail.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-27 16:06:19 -07:00
Linus Torvalds
a64227b085 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
  mmc: queue: bring discard_granularity/alignment into line with SCSI
  mmc: queue: append partition subname to queue thread name
  mmc: core: make erase timeout calculation allow for gated clock
  mmc: block: switch card to User Data Area when removing the block driver
  mmc: sdio: reset card during power_restore
  mmc: cb710: fix #ifdef HAVE_EFFICIENT_UNALIGNED_ACCESS
  mmc: sdhi: DMA slave ID 0 is invalid
  mmc: tmio: fix regression in TMIO_MMC_WRPROTECT_DISABLE handling
  mmc: omap_hsmmc: use original sg_len for dma_unmap_sg
  mmc: omap_hsmmc: fix ocr mask usage
  mmc: sdio: fix runtime PM path during driver removal
  mmc: Add PCI fixup quirks for Ricoh 1180:e823 reader
  mmc: sdhi: fix module unloading
  mmc: of_mmc_spi: add NO_IRQ define to of_mmc_spi.c
  mmc: vub300: fix null dereferences in error handling
2011-06-27 14:55:43 -07:00
KAMEZAWA Hiroyuki
c6830c2260 Fix node_start/end_pfn() definition for mm/page_cgroup.c
commit 21a3c96 uses node_start/end_pfn(nid) for detection start/end
of nodes. But, it's not defined in linux/mmzone.h but defined in
/arch/???/include/mmzone.h which is included only under
CONFIG_NEED_MULTIPLE_NODES=y.

Then, we see
  mm/page_cgroup.c: In function 'page_cgroup_init':
  mm/page_cgroup.c:308: error: implicit declaration of function 'node_start_pfn'
  mm/page_cgroup.c:309: error: implicit declaration of function 'node_end_pfn'

So, fixiing page_cgroup.c is an idea...

But node_start_pfn()/node_end_pfn() is a very generic macro and
should be implemented in the same manner for all archs.
(m32r has different implementation...)

This patch removes definitions of node_start/end_pfn() in each archs
and defines a unified one in linux/mmzone.h. It's not under
CONFIG_NEED_MULTIPLE_NODES, now.

A result of macro expansion is here (mm/page_cgroup.c)

for !NUMA
 start_pfn = ((&contig_page_data)->node_start_pfn);
  end_pfn = ({ pg_data_t *__pgdat = (&contig_page_data); __pgdat->node_start_pfn + __pgdat->node_spanned_pages;});

for NUMA (x86-64)
  start_pfn = ((node_data[nid])->node_start_pfn);
  end_pfn = ({ pg_data_t *__pgdat = (node_data[nid]); __pgdat->node_start_pfn + __pgdat->node_spanned_pages;});

Changelog:
 - fixed to avoid using "nid" twice in node_end_pfn() macro.

Reported-and-acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Reported-and-tested-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-27 14:13:09 -07:00
Johannes Berg
04b7dcf979 wireless: unify QoS control field definitions
Move all that mac80211 has into the generic
ieee80211.h header file and use them. At the
same time move them from mask+shift to just
bits and rename them for consistent names.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-27 15:09:39 -04:00
Oleg Nesterov
0347e17739 ptrace: ptrace_reparented() should check same_thread_group()
ptrace_reparented() naively does parent != real_parent, this means
it returns true even if the tracer _is_ the real parent. This is per
process thing, not per-thread. The only reason ->real_parent can
point to the non-leader thread is that we have __WNOTHREAD.

Change it to check !same_thread_group(parent, real_parent).

It has two callers, and in both cases the current check does not
look right.

exit_notify: we should respect ->exit_signal if the exiting leader
is traced by any thread from the parent thread group. It is the
child of the whole group, and we are going to send the signal to
the whole group.

wait_task_zombie: without __WNOTHREAD do_wait() should do the same
for any thread, only sys_ptrace() is "bound" to the single thread.
However do_wait(WEXITED) succeeds but does not release a traced
natural child unless the caller is the tracer.

Test-case:

	void *tfunc(void *arg)
	{
		assert(ptrace(PTRACE_ATTACH, (long)arg, 0,0) == 0);
		pause();
		return NULL;
	}

	int main(void)
	{
		pthread_t thr;
		pid_t pid, stat, ret;

		pid = fork();
		if (!pid) {
			pause();
			assert(0);
		}

		assert(pthread_create(&thr, NULL, tfunc, (void*)(long)pid) == 0);

		assert(waitpid(-1, &stat, 0) == pid);
		assert(WIFSTOPPED(stat));

		kill(pid, SIGKILL);

		assert(waitpid(-1, &stat, 0) == pid);
		assert(WIFSIGNALED(stat) && WTERMSIG(stat) == SIGKILL);

		ret = waitpid(pid, &stat, 0);
		if (ret < 0)
			return 0;

		printf("WTF? %d is dead, but: wait=%d stat=%x\n",
				pid, ret, stat);

		return 1;
	}

Note that the main thread simply does

	pid = fork();
	kill(pid, SIGKILL);

and then without the patch wait4(WEXITED) succeeds twice and reports
WTERMSIG(stat) == SIGKILL.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
2011-06-27 20:30:10 +02:00
Oleg Nesterov
087806b128 redefine thread_group_leader() as exit_signal >= 0
Change de_thread() to set old_leader->exit_signal = -1. This is
good for the consistency, it is no longer the leader and all
sub-threads have exit_signal = -1 set by copy_process(CLONE_THREAD).

And this allows us to micro-optimize thread_group_leader(), it can
simply check exit_signal >= 0. This also makes sense because we
should move ->group_leader from task_struct to signal_struct.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
2011-06-27 20:30:10 +02:00
Oleg Nesterov
e550f14dc6 kill task_detached()
Upadate the last user of task_detached(), wait_task_zombie(), to
use thread_group_leader() and kill task_detached().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
2011-06-27 20:30:09 +02:00
Oleg Nesterov
8677347378 make do_notify_parent() __must_check, update the callers
Change other callers of do_notify_parent() to check the value it
returns, this makes the subsequent task_detached() unnecessary.
Mark do_notify_parent() as __must_check.

Use thread_group_leader() instead of !task_detached() to check
if we need to notify the real parent in wait_task_zombie().

Remove the stale comment in release_task(). "just for sanity" is
no longer true, we have to set EXIT_DEAD to avoid the races with
do_wait().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
2011-06-27 20:30:09 +02:00
Oleg Nesterov
45cdf5cc07 kill tracehook_notify_death()
Kill tracehook_notify_death(), reimplement the logic in its caller,
exit_notify().

Also, change the exec_id's check to use thread_group_leader() instead
of task_detached(), this is more clear. This logic only applies to
the exiting leader, a sub-thread must never change its exit_signal.

Note: when the traced group leader exits the exit_signal-or-SIGCHLD
logic looks really strange:

	- we notify the tracer even if !thread_group_empty() but
	   do_wait(WEXITED) can't work until all threads exit

	- if the tracer is real_parent, it is not clear why can't
	  we use ->exit_signal event if !thread_group_empty()

-v2: do not try to fix the 2nd oddity to avoid the subtle behavior
     change mixed with reorganization, suggested by Tejun.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
2011-06-27 20:30:08 +02:00
Oleg Nesterov
53c8f9f199 make do_notify_parent() return bool
- change do_notify_parent() to return a boolean, true if the task should
  be reaped because its parent ignores SIGCHLD.

- update the only caller which checks the returned value, exit_notify().

This temporary uglifies exit_notify() even more, will be cleanuped by
the next change.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
2011-06-27 20:30:08 +02:00
John W. Linville
36099365c7 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
Conflicts:
	drivers/net/wireless/rtlwifi/pci.c
	include/linux/netlink.h
2011-06-24 15:25:51 -04:00
Linus Torvalds
5220cc9382 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
* 'for-linus' of git://git.kernel.dk/linux-block:
  block: add REQ_SECURE to REQ_COMMON_MASK
  block: use the passed in @bdev when claiming if partno is zero
  block: Add __attribute__((format(printf...) and fix fallout
  block: make disk_block_events() properly wait for work cancellation
  block: remove non-syncing __disk_block_events() and fold it into disk_block_events()
  block: don't use non-syncing event blocking in disk_check_events()
  cfq-iosched: fix locking around ioc->ioc_data assignment
2011-06-24 08:42:35 -07:00
Timur Tabi
39785eb1d3 fsl-diu-fb: remove check for pixel clock ranges
The Freescale DIU framebuffer driver defines two constants, MIN_PIX_CLK and
MAX_PIX_CLK, that are supposed to represent the lower and upper limits of
the pixel clock.  These values, however, are true only for one platform
clock rate (533MHz) and only for the MPC8610.  So the actual range for
the pixel clock is chip-specific, which means the current values are almost
always wrong.  The chance of an out-of-range pixel clock being used are also
remote.

Rather than try to detect an out-of-range clock in the DIU driver, we depend
on the board-specific pixel clock function (e.g. p1022ds_set_pixel_clock)
to clamp the pixel clock to a supported value.

Signed-off-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2011-06-24 17:08:49 +09:00
David Wagner
01dc9cc314 UBI: clarify the volume notification types' doc
I realized the new descriptions of ADDED and REMOVED could also be
misleading: they can also be triggered after using a userland util
(ubi{mk,rm}vol).

Artem: amend the commentaries

Signed-off-by David Wagner <david.wagner@free-electrons.com>
Signed-off-by: Artem Bityutskiy <dedekind1@gmail.com>
2011-06-23 17:40:07 +03:00
Linus Torvalds
68d0080f1e Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6
* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
  PCI / PM: Block races between runtime PM and system sleep
  PM / Domains: Update documentation
  PM / Runtime: Handle clocks correctly if CONFIG_PM_RUNTIME is unset
  PM: Fix async resume following suspend failure
  PM: Free memory bitmaps if opening /dev/snapshot fails
  PM: Rename dev_pm_info.in_suspend to is_prepared
  PM: Update documentation regarding sysdevs
  PM / Runtime: Update doc: usage count no longer incremented across system PM
2011-06-22 21:08:52 -07:00
Gabor Juhos
7d95847c9b ath9k: add external_reset callback to ath9k_platfom_data for AR9330
The patch adds a callback to ath9k_platform_data. If the
callback is provided by the platform code, then it can be
used to hard reset the WMAC device.

The callback is required for doing a hard reset of the AR9330
chips to get them working again after a hang.

Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-22 16:09:57 -04:00
Gabor Juhos
3762561aa8 ath9k: add MAC revision detection for AR9330
The AR9330 1.0 and 1.1 are using the same revision,
thus it is not possible to distinguish the two chips.
The platform setup code can distinguish the chips based
on the SoC revision.

Add a callback function to ath9k_platform_data in order
to allow getting the revision number from the platform code.

Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-22 16:09:49 -04:00
Johannes Berg
670dc2833d netlink: advertise incomplete dumps
Consider the following situation:
 * a dump that would show 8 entries, four in the first
   round, and four in the second
 * between the first and second rounds, 6 entries are
   removed
 * now the second round will not show any entry, and
   even if there is a sequence/generation counter the
   application will not know

To solve this problem, add a new flag NLM_F_DUMP_INTR
to the netlink header that indicates the dump wasn't
consistent, this flag can also be set on the MSG_DONE
message that terminates the dump, and as such above
situation can be detected.

To achieve this, add a sequence counter to the netlink
callback struct. Of course, netlink code still needs
to use this new functionality. The correct way to do
that is to always set cb->seq when a dumpit callback
is invoked and call nl_dump_check_consistent() for
each new message. The core code will also call this
function for the final MSG_DONE message.

To make it usable with generic netlink, a new function
genlmsg_nlhdr() is needed to obtain the netlink header
from the genetlink user header.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-22 16:09:45 -04:00
Tejun Heo
06d984737b ptrace: s/tracehook_tracer_task()/ptrace_parent()/
tracehook.h is on the way out.  Rename tracehook_tracer_task() to
ptrace_parent() and move it from tracehook.h to ptrace.h.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-22 19:26:29 +02:00
Tejun Heo
4b9d33e6d8 ptrace: kill clone/exec tracehooks
At this point, tracehooks aren't useful to mainline kernel and mostly
just add an extra layer of obfuscation.  Although they have comments,
without actual in-kernel users, it is difficult to tell what are their
assumptions and they're actually trying to achieve.  To mainline
kernel, they just aren't worth keeping around.

This patch kills the following clone and exec related tracehooks.

	tracehook_prepare_clone()
	tracehook_finish_clone()
	tracehook_report_clone()
	tracehook_report_clone_complete()
	tracehook_unsafe_exec()

The changes are mostly trivial - logic is moved to the caller and
comments are merged and adjusted appropriately.

The only exception is in check_unsafe_exec() where LSM_UNSAFE_PTRACE*
are OR'd to bprm->unsafe instead of setting it, which produces the
same result as the field is always zero on entry.  It also tests
p->ptrace instead of (p->ptrace & PT_PTRACED) for consistency, which
also gives the same result.

This doesn't introduce any behavior change.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-22 19:26:29 +02:00
Tejun Heo
a288eecce5 ptrace: kill trivial tracehooks
At this point, tracehooks aren't useful to mainline kernel and mostly
just add an extra layer of obfuscation.  Although they have comments,
without actual in-kernel users, it is difficult to tell what are their
assumptions and they're actually trying to achieve.  To mainline
kernel, they just aren't worth keeping around.

This patch kills the following trivial tracehooks.

* Ones testing whether task is ptraced.  Replace with ->ptrace test.

	tracehook_expect_breakpoints()
	tracehook_consider_ignored_signal()
	tracehook_consider_fatal_signal()

* ptrace_event() wrappers.  Call directly.

	tracehook_report_exec()
	tracehook_report_exit()
	tracehook_report_vfork_done()

* ptrace_release_task() wrapper.  Call directly.

	tracehook_finish_release_task()

* noop

	tracehook_prepare_release_task()
	tracehook_report_death()

This doesn't introduce any behavior change.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-22 19:26:28 +02:00
Tejun Heo
f3c04b934d ptrace: move SIGTRAP on exec(2) logic to ptrace_event()
Move SIGTRAP on exec(2) logic from tracehook_report_exec() to
ptrace_event().  This is part of changes to make ptrace_event()
smarter and handle ptrace event related details in one place.

This doesn't introduce any behavior change.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-22 19:26:28 +02:00
Tejun Heo
643ad8388e ptrace: introduce ptrace_event_enabled() and simplify ptrace_event() and tracehook_prepare_clone()
This patch implements ptrace_event_enabled() which tests whether a
given PTRACE_EVENT_* is enabled and use it to simplify ptrace_event()
and tracehook_prepare_clone().

PT_EVENT_FLAG() macro is added which calculates PT_TRACE_* flag from
PTRACE_EVENT_*.  This is used to define PT_TRACE_* flags and by
ptrace_event_enabled() to find the matching flag.

This is used to make ptrace_event() and tracehook_prepare_clone()
simpler.

* ptrace_event() callers were responsible for providing mask to test
  whether the event was enabled.  This patch implements
  ptrace_event_enabled() and make ptrace_event() drop @mask and
  determine whether the event is enabled from @event.  Note that
  @event is constant and this conversion doesn't add runtime overhead.

  All conversions except tracehook_report_clone_complete() are
  trivial.  tracehook_report_clone_complete() used to use 0 for @mask
  (always enabled) but now tests whether the specified event is
  enabled.  This doesn't cause any behavior difference as it's
  guaranteed that the event specified by @trace is enabled.

* tracehook_prepare_clone() now only determines which event is
  applicable and use ptrace_event_enabled() for enable test.

This doesn't introduce any behavior change.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-22 19:26:28 +02:00
Tejun Heo
d21142ece4 ptrace: kill task_ptrace()
task_ptrace(task) simply dereferences task->ptrace and isn't even used
consistently only adding confusion.  Kill it and directly access
->ptrace instead.

This doesn't introduce any behavior change.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-22 19:26:27 +02:00
Alexey Dobriyan
b7f080cfe2 net: remove mm.h inclusion from netdevice.h
Remove linux/mm.h inclusion from netdevice.h -- it's unused (I've checked manually).

To prevent mm.h inclusion via other channels also extract "enum dma_data_direction"
definition into separate header. This tiny piece is what gluing netdevice.h with mm.h
via "netdevice.h => dmaengine.h => dma-mapping.h => scatterlist.h => mm.h".
Removal of mm.h from scatterlist.h was tried and was found not feasible
on most archs, so the link was cutoff earlier.

Hope people are OK with tiny include file.

Note, that mm_types.h is still dragged in, but it is a separate story.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-21 19:17:20 -07:00
Linus Torvalds
2992c4bd57 Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFS: Fix decode_secinfo_maxsz
  NFSv4.1: Fix an off-by-one error in pnfs_generic_pg_test
  NFSv4.1: Fix some issues with pnfs_generic_pg_test
  NFSv4.1: file layout must consider pg_bsize for coalescing
  pnfs-obj: No longer needed to take an extra ref at add_device
  SUNRPC: Ensure the RPC client only quits on fatal signals
  NFSv4: Fix a readdir regression
  nfs4.1: mark layout as bad on error path in _pnfs_return_layout
  nfs4.1: prevent race that allowed use of freed layout in _pnfs_return_layout
  NFSv4.1: need to put_layout_hdr on _pnfs_return_layout error path
  NFS: (d)printks should use %zd for ssize_t arguments
  NFSv4.1: fix break condition in pnfs_find_lseg
  nfs4.1: fix several problems with _pnfs_return_layout
  NFSv4.1: allow zero fh array in filelayout decode layout
  NFSv4.1: allow nfs_fhget to succeed with mounted on fileid
  NFSv4.1: Fix a refcounting issue in the pNFS device id cache
  NFSv4.1: deprecate headerpadsz in CREATE_SESSION
  NFS41: do not update isize if inode needs layoutcommit
  NLM: Don't hang forever on NLM unlock requests
  NFS: fix umount of pnfs filesystems
2011-06-21 18:20:55 -07:00
John Fastabend
f9ae7e4b51 dcb: Add ieee_dcb_delapp() and dcb op to delete app entry
Now that we allow multiple IEEE App entries we need a way
to remove specific entries. To do this add the ieee_dcb_delapp()
routine.

Additionaly drivers may need to remove the APP entry from
their firmware tables. Add dcb ops routine to handle this.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-21 16:06:11 -07:00
John Fastabend
314b4778ed net: dcbnl, add multicast group for DCB
Now that dcbnl is being used in many cases by more
than a single agent it is beneficial to be notified
when some entity either driver or user space has
changed the DCB attributes.

Today applications either end up polling the interface
or relying on a user space database to maintain the DCB
state and post events. Polling is a poor solution for
obvious reasons. And relying on a user space database
has its own downside. Namely it has created strange
boot dependencies requiring the database be populated
before any applications dependent on DCB attributes
starts or the application goes into a polling loop.
Populating the database requires negotiating link
setting with the peer and can take anywhere from less
than a second up to a few seconds depending on the switch
implementation.

Perhaps more importantly if another application or an
embedded agent sets a DCB link attribute the database
has no way of knowing other than polling the kernel.
This prevents applications from responding quickly to
changes in link events which at least in the FCoE case
and probably any other protocols expecting a lossless
link may result in IO errors.

By adding a multicast group for DCB we have clean way
to disseminate kernel DCB link attributes up to user
space. Avoiding the need for user space to maintain
a coherant database and disperse events that potentially
do not reflect the current link state.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-21 16:06:11 -07:00
Michael Chan
f4b5ad26bc cnic, bnx2i: Add support for new devices - 57800, 57810, and 57840
And change iSCSI RQ doorbell size from 16B to 64B to match new firmware.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-21 16:06:11 -07:00
Alan Stern
6d0e0e84f6 PM: Fix async resume following suspend failure
The PM core doesn't handle suspend failures correctly when it comes to
asynchronously suspended devices.  These devices are moved onto the
dpm_suspended_list as soon as the corresponding async thread is
started up, and they remain on the list even if they fail to suspend
or the sleep transition is cancelled before they get suspended.  As a
result, when the PM core unwinds the transition, it tries to resume
the devices even though they were never suspended.

This patch (as1474) fixes the problem by adding a new "is_suspended"
flag to dev_pm_info.  Devices are resumed only if the flag is set.

[rjw:
 * Moved the dev->power.is_suspended check into device_resume(),
   because we need to complete dev->power.completion and clear
   dev->power.is_prepared too for devices whose
   dev->power.is_suspended flags are unset.
 * Fixed __device_suspend() to avoid setting dev->power.is_suspended
   if async_error is different from zero.]

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: stable@kernel.org
2011-06-21 23:20:20 +02:00
Alan Stern
f76b168b6f PM: Rename dev_pm_info.in_suspend to is_prepared
This patch (as1473) renames the "in_suspend" field in struct
dev_pm_info to "is_prepared", in preparation for an upcoming change.
The new name is more descriptive of what the field really means.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: stable@kernel.org
2011-06-21 23:19:50 +02:00
Linus Torvalds
890879cfa0 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  jbd2: Fix oops in jbd2_journal_remove_journal_head()
  jbd2: Remove obsolete parameters in the comments for some jbd2 functions
  ext4: fixed tracepoints cleanup
  ext4: use FIEMAP_EXTENT_LAST flag for last extent in fiemap
  ext4: Fix max file size and logical block counting of extent format file
  ext4: correct comments for ext4_free_blocks()
2011-06-21 10:22:35 -07:00
Grant Likely
15c3597d6e dt/platform: allow device name to be overridden
Some platform code has specific requirements on the naming of devices.
This patch allows callers of of_platform_populate() to provide a
device name lookup table.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-06-21 11:04:10 -06:00
Grant Likely
29d4f8a497 dt: add of_platform_populate() for creating device from the device tree
of_platform_populate() is similar to of_platform_bus_probe() except
that it strictly enforces that all device nodes must have a compatible
property, and it can be used to register devices (not buses) which are
children of the root node.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-06-21 11:02:29 -06:00
Grant Likely
cbb49c2665 dt: Add default match table for bus ids
No need for most platforms to define their own bus table when calling
of_platform_populate().  Supply a stock one.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-06-21 11:02:29 -06:00
Joerg Roedel
801019d59d Merge branches 'amd/transparent-bridge' and 'core'
Conflicts:
	arch/x86/include/asm/amd_iommu_types.h
	arch/x86/kernel/amd_iommu.c

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-21 11:14:10 +02:00
Joerg Roedel
403f81d8ee iommu/amd: Move missing parts to drivers/iommu
A few parts of the driver were missing in drivers/iommu.
Move them there to have the complete driver in that
directory.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-21 10:49:31 +02:00
Ohad Ben-Cohen
166e9278a3 x86/ia64: intel-iommu: move to drivers/iommu/
This should ease finding similarities with different platforms,
with the intention of solving problems once in a generic framework
which everyone can use.

Note: to move intel-iommu.c, the declaration of pci_find_upstream_pcie_bridge()
has to move from drivers/pci/pci.h to include/linux/pci.h. This is handled
in this patch, too.

As suggested, also drop DMAR's EXPERIMENTAL tag while we're at it.

Compile-tested on x86_64.

Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-21 10:49:30 +02:00
David S. Miller
9f6ec8d697 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/wireless/iwlwifi/iwl-agn-rxon.c
	drivers/net/wireless/rtlwifi/pci.c
	net/netfilter/ipvs/ip_vs_core.c
2011-06-20 22:29:08 -07:00
Linus Torvalds
79568f5be0 vfs: i_state needs to be 'unsigned long' for now
Commit 13e12d14e2 ("vfs: reorganize 'struct inode' layout a bit")
moved things around a bit changed i_state to be unsigned int instead of
unsigned long.  That was to help structure layout for the 64-bit case,
and shrink 'struct inode' a bit (admittedly that only happened when
spinlock debugging was on and i_flags didn't pack with i_lock).

However, Meelis Roos reports that this results in unaligned exceptions
on sprc, and it turns out that the bit-locking primitives that we use
for the I_NEW bit want to use the bitops.  Which want 'unsigned long',
not 'unsigned int'.

We really should fix the bit locking code to not have that kind of
requirement, but that's a much bigger change.  So for now, revert that
field back to 'unsigned long' (but keep the other re-ordering changes
from the commit that caused this).

Andi points out that we have played games with this in 'struct page', so
it's solvable with other hacks too, but since right now the struct inode
size advantage only happens with some rare config options, it's not
worth fighting.

It _would_ be worth fixing the bitlocking code, though.  Especially
since there is no type safety in the bitlocking code (this never caused
any warnings, and worked fine on x86-64, because the bitlocks take a
'void *' and x86-64 doesn't care that deeply about alignment).  So it's
currently a very easy problem to trigger by mistake and never notice.

Reported-by: Meelis Roos <mroos@linux.ee>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-20 20:13:49 -07:00
Linus Torvalds
eda0841094 Merge branch 'for-2.6.40' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.40' of git://linux-nfs.org/~bfields/linux:
  nfsd4: fix break_lease flags on nfsd open
  nfsd: link returns nfserr_delay when breaking lease
  nfsd: v4 support requires CRYPTO
  nfsd: fix dependency of nfsd on auth_rpcgss
2011-06-20 20:10:52 -07:00
Linus Torvalds
3669820650 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  devcgroup_inode_permission: take "is it a device node" checks to inlined wrapper
  fix comment in generic_permission()
  kill obsolete comment for follow_down()
  proc_sys_permission() is OK in RCU mode
  reiserfs_permission() doesn't need to bail out in RCU mode
  proc_fd_permission() is doesn't need to bail out in RCU mode
  nilfs2_permission() doesn't need to bail out in RCU mode
  logfs doesn't need ->permission() at all
  coda_ioctl_permission() is safe in RCU mode
  cifs_permission() doesn't need to bail out in RCU mode
  bad_inode_permission() is safe from RCU mode
  ubifs: dereferencing an ERR_PTR in ubifs_mount()
2011-06-20 20:09:15 -07:00
Benny Halevy
19345cb299 NFSv4.1: file layout must consider pg_bsize for coalescing
Otherwise we end up overflowing the rpc buffer size on the receive end.

Signed-off-by: Benny Halevy <benny@tonian.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-06-20 16:12:26 -04:00
Rafał Miłecki
440ca98fe8 bcma: clean exports of functions
Function managing IRQs is needed for external drivers like b43.
On the other side we do not expect writing any hosts drivers outside of
bcma, so this is safe to do not export functions related to this.

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-20 15:34:19 -04:00
Linus Torvalds
c01ad40819 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: sh_keysc - 8x8 MODE_6 fix
  Input: omap-keypad - add missing input_sync()
  Input: evdev - try to wake up readers only if we have full packet
  Input: properly assign return value of clamp() macro.
2011-06-20 08:59:46 -07:00
Al Viro
482e0cd3db devcgroup_inode_permission: take "is it a device node" checks to inlined wrapper
inode_permission() calls devcgroup_inode_permission() and almost all such
calls are _not_ for device nodes; let's at least keep the common path
straight...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-20 10:46:04 -04:00
Namhyung Kim
155d109b5f block: add REQ_SECURE to REQ_COMMON_MASK
Add REQ_SECURE flag to REQ_COMMON_MASK so that
init_request_from_bio() can pass it to @req->cmd_flags.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@kernel.org # 2.6.36 and newer
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-06-20 13:23:14 +02:00
Richard Cochran
4ff75b7cf3 net: correct comment on where to place transmit time stamp hook.
The comment for the skb_tx_timestamp() function suggests calling it just
after a buffer is released to the hardware for transmission. However,
for drivers that free the buffer in an ISR, this produces a race between
the time stamp code and the ISR. This commit changes the comment to advise
placing the call just before handing the buffer over to the hardware.

Signed-off-by: Richard Cochran <richard.cochran@omicron.at>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-19 16:35:30 -07:00
Linus Torvalds
8816ead9d8 Merge branches 'perf-urgent-for-linus', 'sched-urgent-for-linus', 'timers-urgent-for-linus' and 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  tools/perf: Fix static build of perf tool
  tracing: Fix regression in printk_formats file

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  generic-ipi: Fix kexec boot crash by initializing call_single_queue before enabling interrupts

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  clocksource: Make watchdog robust vs. interruption
  timerfd: Fix wakeup of processes when timer is cancelled on clock change

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, MAINTAINERS: Add x86 MCE people
  x86, efi: Do not reserve boot services regions within reserved areas
2011-06-19 09:00:18 -07:00
Linus Torvalds
357ed6b1a1 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  rcu: Move RCU_BOOST #ifdefs to header file
  rcu: use softirq instead of kthreads except when RCU_BOOST=y
  rcu: Use softirq to address performance regression
  rcu: Simplify curing of load woes
2011-06-19 08:56:56 -07:00
Manoj Iyer
be98ca652f mmc: Add PCI fixup quirks for Ricoh 1180:e823 reader
Signed-off-by: Manoj Iyer <manoj.iyer@canonical.com>
Cc: <stable@kernel.org>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-06-18 22:18:18 -04:00
Magnus Damm
cca23d0b53 Input: sh_keysc - 8x8 MODE_6 fix
According to the data sheet for G4, AP4 and AG5 KEYSC MODE_6 is 8x8 keys.
Bump up MAXKEYS to 64 too.

Signed-off-by: Magnus Damm <damm@opensource.se>
Reviewed-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
2011-06-18 02:55:01 -07:00
Linus Torvalds
0835619348 Merge branches 'gpio/merge' and 'spi/merge' of git://git.secretlab.ca/git/linux-2.6
* 'gpio/merge' of git://git.secretlab.ca/git/linux-2.6:
  gpio: add GPIOF_ values regardless on kconfig settings
  gpio: include linux/gpio.h where needed
  gpio/omap4: Fix missing interrupts during device wakeup due to IOPAD.

* 'spi/merge' of git://git.secretlab.ca/git/linux-2.6:
  spi/bfin_spi: fix handling of default bits per word setting
2011-06-17 10:36:32 -07:00
David Howells
879669961b KEYS/DNS: Fix ____call_usermodehelper() to not lose the session keyring
____call_usermodehelper() now erases any credentials set by the
subprocess_inf::init() function.  The problem is that commit
17f60a7da1 ("capabilites: allow the application of capability limits
to usermode helpers") creates and commits new credentials with
prepare_kernel_cred() after the call to the init() function.  This wipes
all keyrings after umh_keys_init() is called.

The best way to deal with this is to put the init() call just prior to
the commit_creds() call, and pass the cred pointer to init().  That
means that umh_keys_init() and suchlike can modify the credentials
_before_ they are published and potentially in use by the rest of the
system.

This prevents request_key() from working as it is prevented from passing
the session keyring it set up with the authorisation token to
/sbin/request-key, and so the latter can't assume the authority to
instantiate the key.  This causes the in-kernel DNS resolver to fail
with ENOKEY unconditionally.

Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Eric Paris <eparis@redhat.com>
Tested-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-17 09:40:48 -07:00
Takao Indoh
d8ad7d1123 generic-ipi: Fix kexec boot crash by initializing call_single_queue before enabling interrupts
There is a problem that kdump(2nd kernel) sometimes hangs up due
to a pending IPI from 1st kernel. Kernel panic occurs because IPI
comes before call_single_queue is initialized.

To fix the crash, rename init_call_single_data() to call_function_init()
and call it in start_kernel() so that call_single_queue can be
initialized before enabling interrupts.

The details of the crash are:

 (1) 2nd kernel boots up

 (2) A pending IPI from 1st kernel comes when irqs are first enabled
     in start_kernel().

 (3) Kernel tries to handle the interrupt, but call_single_queue
     is not initialized yet at this point. As a result, in the
     generic_smp_call_function_single_interrupt(), NULL pointer
     dereference occurs when list_replace_init() tries to access
     &q->list.next.

Therefore this patch changes the name of init_call_single_data()
to call_function_init() and calls it before local_irq_enable()
in start_kernel().

Signed-off-by: Takao Indoh <indou.takao@jp.fujitsu.com>
Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Milton Miller <miltonm@bga.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: kexec@lists.infradead.org
Link: http://lkml.kernel.org/r/D6CBEE2F420741indou.takao@jp.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-17 10:17:12 +02:00
Tejun Heo
544b2c91a9 ptrace: implement PTRACE_LISTEN
The previous patch implemented async notification for ptrace but it
only worked while trace is running.  This patch introduces
PTRACE_LISTEN which is suggested by Oleg Nestrov.

It's allowed iff tracee is in STOP trap and puts tracee into
quasi-running state - tracee never really runs but wait(2) and
ptrace(2) consider it to be running.  While ptracer is listening,
tracee is allowed to re-enter STOP to notify an async event.
Listening state is cleared on the first notification.  Ptracer can
also clear it by issuing INTERRUPT - tracee will re-trap into STOP
with listening state cleared.

This allows ptracer to monitor group stop state without running tracee
- use INTERRUPT to put tracee into STOP trap, issue LISTEN and then
wait(2) to wait for the next group stop event.  When it happens,
PTRACE_GETSIGINFO provides information to determine the current state.

Test program follows.

  #define PTRACE_SEIZE		0x4206
  #define PTRACE_INTERRUPT	0x4207
  #define PTRACE_LISTEN		0x4208

  #define PTRACE_SEIZE_DEVEL	0x80000000

  static const struct timespec ts1s = { .tv_sec = 1 };

  int main(int argc, char **argv)
  {
	  pid_t tracee, tracer;
	  int i;

	  tracee = fork();
	  if (!tracee)
		  while (1)
			  pause();

	  tracer = fork();
	  if (!tracer) {
		  siginfo_t si;

		  ptrace(PTRACE_SEIZE, tracee, NULL,
			 (void *)(unsigned long)PTRACE_SEIZE_DEVEL);
		  ptrace(PTRACE_INTERRUPT, tracee, NULL, NULL);
	  repeat:
		  waitid(P_PID, tracee, NULL, WSTOPPED);

		  ptrace(PTRACE_GETSIGINFO, tracee, NULL, &si);
		  if (!si.si_code) {
			  printf("tracer: SIG %d\n", si.si_signo);
			  ptrace(PTRACE_CONT, tracee, NULL,
				 (void *)(unsigned long)si.si_signo);
			  goto repeat;
		  }
		  printf("tracer: stopped=%d signo=%d\n",
			 si.si_signo != SIGTRAP, si.si_signo);
		  if (si.si_signo != SIGTRAP)
			  ptrace(PTRACE_LISTEN, tracee, NULL, NULL);
		  else
			  ptrace(PTRACE_CONT, tracee, NULL, NULL);
		  goto repeat;
	  }

	  for (i = 0; i < 3; i++) {
		  nanosleep(&ts1s, NULL);
		  printf("mother: SIGSTOP\n");
		  kill(tracee, SIGSTOP);
		  nanosleep(&ts1s, NULL);
		  printf("mother: SIGCONT\n");
		  kill(tracee, SIGCONT);
	  }
	  nanosleep(&ts1s, NULL);

	  kill(tracer, SIGKILL);
	  kill(tracee, SIGKILL);
	  return 0;
  }

This is identical to the program to test TRAP_NOTIFY except that
tracee is PTRACE_LISTEN'd instead of PTRACE_CONT'd when group stopped.
This allows ptracer to monitor when group stop ends without running
tracee.

  # ./test-listen
  tracer: stopped=0 signo=5
  mother: SIGSTOP
  tracer: SIG 19
  tracer: stopped=1 signo=19
  mother: SIGCONT
  tracer: stopped=0 signo=5
  tracer: SIG 18
  mother: SIGSTOP
  tracer: SIG 19
  tracer: stopped=1 signo=19
  mother: SIGCONT
  tracer: stopped=0 signo=5
  tracer: SIG 18
  mother: SIGSTOP
  tracer: SIG 19
  tracer: stopped=1 signo=19
  mother: SIGCONT
  tracer: stopped=0 signo=5
  tracer: SIG 18

-v2: Moved JOBCTL_LISTENING check in wait_task_stopped() into
     task_stopped_code() as suggested by Oleg.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
2011-06-16 21:41:54 +02:00
Tejun Heo
fb1d910c17 ptrace: implement TRAP_NOTIFY and use it for group stop events
Currently there's no way for ptracer to find out whether group stop
finished other than polling with INTERRUPT - GETSIGINFO - CONT
sequence.  This patch implements group stop notification for ptracer
using STOP traps.

When group stop state of a seized tracee changes, JOBCTL_TRAP_NOTIFY
is set, which schedules a STOP trap which is sticky - it isn't cleared
by other traps and at least one STOP trap will happen eventually.
STOP trap is synchronization point for event notification and the
tracer can determine the current group stop state by looking at the
signal number portion of exit code (si_status from waitid(2) or
si_code from PTRACE_GETSIGINFO).

Notifications are generated both on start and end of group stops but,
because group stop participation always happens before STOP trap, this
doesn't cause an extra trap while tracee is participating in group
stop.  The symmetry will be useful later.

Note that this notification works iff tracee is not trapped.
Currently there is no way to be notified of group stop state changes
while tracee is trapped.  This will be addressed by a later patch.

An example program follows.

  #define PTRACE_SEIZE		0x4206
  #define PTRACE_INTERRUPT	0x4207

  #define PTRACE_SEIZE_DEVEL	0x80000000

  static const struct timespec ts1s = { .tv_sec = 1 };

  int main(int argc, char **argv)
  {
	  pid_t tracee, tracer;
	  int i;

	  tracee = fork();
	  if (!tracee)
		  while (1)
			  pause();

	  tracer = fork();
	  if (!tracer) {
		  siginfo_t si;

		  ptrace(PTRACE_SEIZE, tracee, NULL,
			 (void *)(unsigned long)PTRACE_SEIZE_DEVEL);
		  ptrace(PTRACE_INTERRUPT, tracee, NULL, NULL);
	  repeat:
		  waitid(P_PID, tracee, NULL, WSTOPPED);

		  ptrace(PTRACE_GETSIGINFO, tracee, NULL, &si);
		  if (!si.si_code) {
			  printf("tracer: SIG %d\n", si.si_signo);
			  ptrace(PTRACE_CONT, tracee, NULL,
				 (void *)(unsigned long)si.si_signo);
			  goto repeat;
		  }
		  printf("tracer: stopped=%d signo=%d\n",
			 si.si_signo != SIGTRAP, si.si_signo);
		  ptrace(PTRACE_CONT, tracee, NULL, NULL);
		  goto repeat;
	  }

	  for (i = 0; i < 3; i++) {
		  nanosleep(&ts1s, NULL);
		  printf("mother: SIGSTOP\n");
		  kill(tracee, SIGSTOP);
		  nanosleep(&ts1s, NULL);
		  printf("mother: SIGCONT\n");
		  kill(tracee, SIGCONT);
	  }
	  nanosleep(&ts1s, NULL);

	  kill(tracer, SIGKILL);
	  kill(tracee, SIGKILL);
	  return 0;
  }

In the above program, tracer keeps tracee running and gets
notification of each group stop state changes.

  # ./test-notify
  tracer: stopped=0 signo=5
  mother: SIGSTOP
  tracer: SIG 19
  tracer: stopped=1 signo=19
  mother: SIGCONT
  tracer: stopped=0 signo=5
  tracer: SIG 18
  mother: SIGSTOP
  tracer: SIG 19
  tracer: stopped=1 signo=19
  mother: SIGCONT
  tracer: stopped=0 signo=5
  tracer: SIG 18
  mother: SIGSTOP
  tracer: SIG 19
  tracer: stopped=1 signo=19
  mother: SIGCONT
  tracer: stopped=0 signo=5
  tracer: SIG 18

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
2011-06-16 21:41:53 +02:00
Tejun Heo
fca26f260c ptrace: implement PTRACE_INTERRUPT
Currently, there's no way to trap a running ptracee short of sending a
signal which has various side effects.  This patch implements
PTRACE_INTERRUPT which traps ptracee without any signal or job control
related side effect.

The implementation is almost trivial.  It uses the group stop trap -
SIGTRAP | PTRACE_EVENT_STOP << 8.  A new trap flag
JOBCTL_TRAP_INTERRUPT is added, which is set on PTRACE_INTERRUPT and
cleared when any trap happens.  As INTERRUPT should be useable
regardless of the current state of tracee, task_is_traced() test in
ptrace_check_attach() is skipped for INTERRUPT.

PTRACE_INTERRUPT is available iff tracee is attached with
PTRACE_SEIZE.

Test program follows.

  #define PTRACE_SEIZE		0x4206
  #define PTRACE_INTERRUPT	0x4207

  #define PTRACE_SEIZE_DEVEL	0x80000000

  static const struct timespec ts100ms = { .tv_nsec = 100000000 };
  static const struct timespec ts1s = { .tv_sec = 1 };
  static const struct timespec ts3s = { .tv_sec = 3 };

  int main(int argc, char **argv)
  {
	  pid_t tracee;

	  tracee = fork();
	  if (tracee == 0) {
		  nanosleep(&ts100ms, NULL);
		  while (1) {
			  printf("tracee: alive pid=%d\n", getpid());
			  nanosleep(&ts1s, NULL);
		  }
	  }

	  if (argc > 1)
		  kill(tracee, SIGSTOP);

	  nanosleep(&ts100ms, NULL);

	  ptrace(PTRACE_SEIZE, tracee, NULL,
		 (void *)(unsigned long)PTRACE_SEIZE_DEVEL);
	  if (argc > 1) {
		  waitid(P_PID, tracee, NULL, WSTOPPED);
		  ptrace(PTRACE_CONT, tracee, NULL, NULL);
	  }
	  nanosleep(&ts3s, NULL);

	  printf("tracer: INTERRUPT and DETACH\n");
	  ptrace(PTRACE_INTERRUPT, tracee, NULL, NULL);
	  waitid(P_PID, tracee, NULL, WSTOPPED);
	  ptrace(PTRACE_DETACH, tracee, NULL, NULL);
	  nanosleep(&ts3s, NULL);

	  printf("tracer: exiting\n");
	  kill(tracee, SIGKILL);
	  return 0;
  }

When called without argument, tracee is seized from running state,
interrupted and then detached back to running state.

  # ./test-interrupt
  tracee: alive pid=4546
  tracee: alive pid=4546
  tracee: alive pid=4546
  tracer: INTERRUPT and DETACH
  tracee: alive pid=4546
  tracee: alive pid=4546
  tracee: alive pid=4546
  tracer: exiting

When called with argument, tracee is seized from stopped state,
continued, interrupted and then detached back to stopped state.

  # ./test-interrupt  1
  tracee: alive pid=4548
  tracee: alive pid=4548
  tracee: alive pid=4548
  tracer: INTERRUPT and DETACH
  tracer: exiting

Before PTRACE_INTERRUPT, once the tracee was running, there was no way
to trap tracee and do PTRACE_DETACH without causing side effect.

-v2: Updated to use task_set_jobctl_pending() so that it doesn't end
     up scheduling TRAP_STOP if child is dying which may make the
     child unkillable.  Spotted by Oleg.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
2011-06-16 21:41:53 +02:00
Tejun Heo
3544d72a0e ptrace: implement PTRACE_SEIZE
PTRACE_ATTACH implicitly issues SIGSTOP on attach which has side
effects on tracee signal and job control states.  This patch
implements a new ptrace request PTRACE_SEIZE which attaches a tracee
without trapping it or affecting its signal and job control states.

The usage is the same with PTRACE_ATTACH but it takes PTRACE_SEIZE_*
flags in @data.  Currently, the only defined flag is
PTRACE_SEIZE_DEVEL which is a temporary flag to enable PTRACE_SEIZE.
PTRACE_SEIZE will change ptrace behaviors outside of attach itself.
The changes will be implemented gradually and the DEVEL flag is to
prevent programs which expect full SEIZE behavior from using it before
all the behavior modifications are complete while allowing unit
testing.  The flag will be removed once SEIZE behaviors are completely
implemented.

* PTRACE_SEIZE, unlike ATTACH, doesn't force tracee to trap.  After
  attaching tracee continues to run unless a trap condition occurs.

* PTRACE_SEIZE doesn't affect signal or group stop state.

* If PTRACE_SEIZE'd, group stop uses PTRACE_EVENT_STOP trap which uses
  exit_code of (signr | PTRACE_EVENT_STOP << 8) where signr is one of
  the stopping signals if group stop is in effect or SIGTRAP
  otherwise, and returns usual trap siginfo on PTRACE_GETSIGINFO
  instead of NULL.

Seizing sets PT_SEIZED in ->ptrace of the tracee.  This flag will be
used to determine whether new SEIZE behaviors should be enabled.

Test program follows.

  #define PTRACE_SEIZE		0x4206
  #define PTRACE_SEIZE_DEVEL	0x80000000

  static const struct timespec ts100ms = { .tv_nsec = 100000000 };
  static const struct timespec ts1s = { .tv_sec = 1 };
  static const struct timespec ts3s = { .tv_sec = 3 };

  int main(int argc, char **argv)
  {
	  pid_t tracee;

	  tracee = fork();
	  if (tracee == 0) {
		  nanosleep(&ts100ms, NULL);
		  while (1) {
			  printf("tracee: alive\n");
			  nanosleep(&ts1s, NULL);
		  }
	  }

	  if (argc > 1)
		  kill(tracee, SIGSTOP);

	  nanosleep(&ts100ms, NULL);

	  ptrace(PTRACE_SEIZE, tracee, NULL,
		 (void *)(unsigned long)PTRACE_SEIZE_DEVEL);
	  if (argc > 1) {
		  waitid(P_PID, tracee, NULL, WSTOPPED);
		  ptrace(PTRACE_CONT, tracee, NULL, NULL);
	  }
	  nanosleep(&ts3s, NULL);
	  printf("tracer: exiting\n");
	  return 0;
  }

When the above program is called w/o argument, tracee is seized while
running and remains running.  When tracer exits, tracee continues to
run and print out messages.

  # ./test-seize-simple
  tracee: alive
  tracee: alive
  tracee: alive
  tracer: exiting
  tracee: alive
  tracee: alive

When called with an argument, tracee is seized from stopped state and
continued, and returns to stopped state when tracer exits.

  # ./test-seize
  tracee: alive
  tracee: alive
  tracee: alive
  tracer: exiting
  # ps -el|grep test-seize
  1 T     0  4720     1  0  80   0 -   941 signal ttyS0    00:00:00 test-seize

-v2: SEIZE doesn't schedule TRAP_STOP and leaves tracee running as Jan
     suggested.

-v3: PTRACE_EVENT_STOP traps now report group stop state by signr.  If
     group stop is in effect the stop signal number is returned as
     part of exit_code; otherwise, SIGTRAP.  This was suggested by
     Denys and Oleg.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
Cc: Denys Vlasenko <vda.linux@googlemail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
2011-06-16 21:41:53 +02:00
Tejun Heo
73ddff2bee job control: introduce JOBCTL_TRAP_STOP and use it for group stop trap
do_signal_stop() implemented both normal group stop and trap for group
stop while ptraced.  This approach has been enough but scheduled
changes require trap mechanism which can be used in more generic
manner and using group stop trap for generic trap site simplifies both
userland visible interface and implementation.

This patch adds a new jobctl flag - JOBCTL_TRAP_STOP.  When set, it
triggers a trap site, which behaves like group stop trap, in
get_signal_to_deliver() after checking for pending signals.  While
ptraced, do_signal_stop() doesn't stop itself.  It initiates group
stop if requested and schedules JOBCTL_TRAP_STOP and returns.  The
caller - get_signal_to_deliver() - is responsible for checking whether
TRAP_STOP is pending afterwards and handling it.

ptrace_attach() is updated to use JOBCTL_TRAP_STOP instead of
JOBCTL_STOP_PENDING and __ptrace_unlink() to clear all pending trap
bits and TRAPPING so that TRAP_STOP and future trap bits don't linger
after detach.

While at it, add proper function comment to do_signal_stop() and make
it return bool.

-v2: __ptrace_unlink() updated to clear JOBCTL_TRAP_MASK and TRAPPING
     instead of JOBCTL_PENDING_MASK.  This avoids accidentally
     clearing JOBCTL_STOP_CONSUME.  Spotted by Oleg.

-v3: do_signal_stop() updated to return %false without dropping
     siglock while ptraced and TRAP_STOP check moved inside for(;;)
     loop after group stop participation.  This avoids unnecessary
     relocking and also will help avoiding unnecessary traps by
     consuming group stop before handling pending traps.

-v4: Jobctl trap handling moved into a separate function -
     do_jobctl_trap().

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
2011-06-16 21:41:52 +02:00
Shreshtha Kumar Sahu
c16d51a32b amba pl011: workaround for uart registers lockup
This workaround aims to break the deadlock situation
which raises during continuous transfer of data for long
duration over uart with hardware flow control. It is
observed that CTS interrupt cannot be cleared in uart
interrupt register (ICR). Hence further transfer over
uart gets blocked.

It is seen that during such deadlock condition ICR
don't get cleared even on multiple write. This leads
pass_counter to decrease and finally reach zero. This
can be taken as trigger point to run this UART_BT_WA.

Workaround backups the register configuration, does soft
reset of UART using BIT-0 of PRCC_K_SOFTRST_SET/CLEAR
registers and restores the registers.

This patch also provides support for uart init and exit
function calls if present.

Signed-off-by: Shreshtha Kumar Sahu <shreshthakumar.sahu@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-06-16 12:01:57 -07:00
Thomas Gleixner
b5199515c2 clocksource: Make watchdog robust vs. interruption
The clocksource watchdog code is interruptible and it has been
observed that this can trigger false positives which disable the TSC.

The reason is that an interrupt storm or a long running interrupt
handler between the read of the watchdog source and the read of the
TSC brings the two far enough apart that the delta is larger than the
unstable treshold. Move both reads into a short interrupt disabled
region to avoid that.

Reported-and-tested-by: Vernon Mauery <vernux@us.ibm.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@kernel.org
2011-06-16 19:30:53 +02:00
Linus Torvalds
8dac6bee32 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  AFS: Use i_generation not i_version for the vnode uniquifier
  AFS: Set s_id in the superblock to the volume name
  vfs: Fix data corruption after failed write in __block_write_begin()
  afs: afs_fill_page reads too much, or wrong data
  VFS: Fix vfsmount overput on simultaneous automount
  fix wrong iput on d_inode introduced by e6bc45d65d
  Delay struct net freeing while there's a sysfs instance refering to it
  afs: fix sget() races, close leak on umount
  ubifs: fix sget races
  ubifs: split allocation of ubifs_info into a separate function
  fix leak in proc_set_super()
2011-06-16 10:21:59 -07:00
Jozsef Kadlecsik
15b4d93f03 netfilter: ipset: whitespace and coding fixes detected by checkpatch.pl
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 19:01:26 +02:00
Jozsef Kadlecsik
e385357a2f netfilter: ipset: hash:net,iface type introduced
The hash:net,iface type makes possible to store network address and
interface name pairs in a set. It's mostly suitable for egress
and ingress filtering. Examples:

        # ipset create test hash:net,iface
        # ipset add test 192.168.0.0/16,eth0
        # ipset add test 192.168.0.0/24,eth1

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 19:00:48 +02:00
Jozsef Kadlecsik
b66554cf03 netfilter: ipset: add xt_action_param to the variant level kadt functions, ipset API change
With the change the sets can use any parameter available for the match
and target extensions, like input/output interface. It's required for
the hash:net,iface set type.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:56:47 +02:00
Jozsef Kadlecsik
e6146e8684 netfilter: ipset: use unified from/to address masking and check the usage
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:55:58 +02:00
Jozsef Kadlecsik
c64562eaf2 netfilter: ipset: adding ranges to hash types with timeout could still fail, fixed
The patch "Fix adding ranges to hash types" had got a mistypeing
in the timeout variant of the hash types, which actually made
the patch ineffective. Fixed!

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:53:51 +02:00
Jozsef Kadlecsik
d0d9e0a5a8 netfilter: ipset: support range for IPv4 at adding/deleting elements for hash:*net* types
The range internally is converted to the network(s) equal to the range.
Example:

	# ipset new test hash:net
	# ipset add test 10.2.0.0-10.2.1.12
	# ipset list test
	Name: test
	Type: hash:net
	Header: family inet hashsize 1024 maxelem 65536
	Size in memory: 16888
	References: 0
	Members:
	10.2.1.12
	10.2.1.0/29
	10.2.0.0/24
	10.2.1.8/30

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:52:41 +02:00
Jozsef Kadlecsik
f1e00b3979 netfilter: ipset: set type support with multiple revisions added
A set type may have multiple revisions, for example when syntax is
extended. Support continuous revision ranges in set types.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:51:41 +02:00
Jozsef Kadlecsik
3d14b171f0 netfilter: ipset: fix adding ranges to hash types
When ranges are added to hash types, the elements may trigger rehashing
the set. However, the last successfully added element was not kept track
so the adding started again with the first element after the rehashing.

Bug reported by Mr Dash Four.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:49:17 +02:00
Jozsef Kadlecsik
c1e2e04388 netfilter: ipset: support listing setnames and headers too
Current listing makes possible to list sets with full content only.
The patch adds support partial listings, i.e. listing just
the existing setnames or listing set headers, without set members.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:47:07 +02:00
Jozsef Kadlecsik
ac8cc925d3 netfilter: ipset: options and flags support added to the kernel API
The support makes possible to specify the timeout value for
the SET target and a flag to reset the timeout for already existing
entries.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:42:40 +02:00
Jozsef Kadlecsik
5416219e5c netfilter: ipset: timeout can be modified for already added elements
When an element to a set with timeout added, one can change the timeout
by "readding" the element with the "-exist" flag. That means the timeout
value is reset to the specified one (or to the default from the set
specification if the "timeout n" option is not used). Example

ipset add foo 1.2.3.4 timeout 10
ipset add foo 1.2.3.4 timeout 600 -exist

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-16 18:40:55 +02:00
Christoph Lameter
3192b920bf slab, slub, slob: Unify alignment definition
Every slab has its on alignment definition in include/linux/sl?b_def.h. Extract those
and define a common set in include/linux/slab.h.

SLOB: As notes sometimes we need double word alignment on 32 bit. This gives all
structures allocated by SLOB a unsigned long long alignment like the others do.

SLAB: If ARCH_SLAB_MINALIGN is not set SLAB would set ARCH_SLAB_MINALIGN to
zero meaning no alignment at all. Give it the default unsigned long long alignment.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-06-16 19:40:20 +03:00
Randy Dunlap
c001fb72a7 gpio: add GPIOF_ values regardless on kconfig settings
Make GPIOF_ defined values available even when GPIOLIB nor GENERIC_GPIO
is enabled by moving them to <linux/gpio.h>.

Fixes these build errors in linux-next:
sound/soc/codecs/ak4641.c:524: error: 'GPIOF_OUT_INIT_LOW' undeclared (first use in this function)
sound/soc/codecs/wm8915.c:2921: error: 'GPIOF_OUT_INIT_LOW' undeclared (first use in this function)

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-06-16 08:40:52 -06:00
Josh Triplett
bd5dc17be8 uts: make default hostname configurable, rather than always using "(none)"
The "hostname" tool falls back to setting the hostname to "localhost" if
/etc/hostname does not exist.  Distribution init scripts have the same
fallback.  However, if userspace never calls sethostname, such as when
booting with init=/bin/sh, or otherwise booting a minimal system without
the usual init scripts, the default hostname of "(none)" remains,
unhelpfully appearing in various places such as prompts ("root@(none):~#")
and logs.  Furthermore, "(none)" doesn't typically resolve to anything
useful.

Make the default hostname configurable.  This removes the need for the
standard fallback, provides a useful default for systems that never call
sethostname, and makes minimal systems that much more useful with less
configuration.  Distributions could choose to use "localhost" here to
avoid the fallback, while embedded systems may wish to use a specific
target hostname.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: David Miller <davem@davemloft.net>
Cc: Serge Hallyn <serue@us.ibm.com>
Cc: Kel Modderman <kel@otaku42.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-15 20:04:00 -07:00
Dr. David Alan Gilbert
ca39599c63 BUILD_BUG_ON_ZERO: fix sparse breakage
BUILD_BUG_ON_ZERO and BUILD_BUG_ON_NULL must return values, even in the
CHECKER case otherwise various users of it become syntactically invalid.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-15 20:04:00 -07:00
KOSAKI Motohiro
32e45ff43e mm: increase RECLAIM_DISTANCE to 30
Recently, Robert Mueller reported (http://lkml.org/lkml/2010/9/12/236)
that zone_reclaim_mode doesn't work properly on his new NUMA server (Dual
Xeon E5520 + Intel S5520UR MB).  He is using Cyrus IMAPd and it's built on
a very traditional single-process model.

  * a master process which reads config files and manages the other
    process
  * multiple imapd processes, one per connection
  * multiple pop3d processes, one per connection
  * multiple lmtpd processes, one per connection
  * periodical "cleanup" processes.

There are thousands of independent processes.  The problem is, recent
Intel motherboard turn on zone_reclaim_mode by default and traditional
prefork model software don't work well on it.  Unfortunatelly, such models
are still typical even in the 21st century.  We can't ignore them.

This patch raises the zone_reclaim_mode threshold to 30.  30 doesn't have
any specific meaning.  but 20 means that one-hop QPI/Hypertransport and
such relatively cheap 2-4 socket machine are often used for traditional
servers as above.  The intention is that these machines don't use
zone_reclaim_mode.

Note: ia64 and Power have arch specific RECLAIM_DISTANCE definitions.
This patch doesn't change such high-end NUMA machine behavior.

Dave Hansen said:

: I know specifically of pieces of x86 hardware that set the information
: in the BIOS to '21' *specifically* so they'll get the zone_reclaim_mode
: behavior which that implies.
:
: They've done performance testing and run very large and scary benchmarks
: to make sure that they _want_ this turned on.  What this means for them
: is that they'll probably be de-optimized, at least on newer versions of
: the kernel.
:
: If you want to do this for particular systems, maybe _that_'s what we
: should do.  Have a list of specific configurations that need the
: defaults overridden either because they're buggy, or they have an
: unusual hardware configuration not really reflected in the distance
: table.

And later said:

: The original change in the hardware tables was for the benefit of a
: benchmark.  Said benchmark isn't going to get run on mainline until the
: next batch of enterprise distros drops, at which point the hardware where
: this was done will be irrelevant for the benchmark.  I'm sure any new
: hardware will just set this distance to another yet arbitrary value to
: make the kernel do what it wants.  :)
:
: Also, when the hardware got _set_ to this initially, I complained.  So, I
: guess I'm getting my way now, with this patch.  I'm cool with it.

Reported-by: Robert Mueller <robm@fastmail.fm>
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: David Rientjes <rientjes@google.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Acked-by: Dave Hansen <dave@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-15 20:03:59 -07:00
Randy Dunlap
ac5622418b kmsg_dump.h: fix build when CONFIG_PRINTK is disabled
Fix <linux/kmsg_dump.h> when CONFIG_PRINTK is not enabled:

  include/linux/kmsg_dump.h:56: error: 'EINVAL' undeclared (first use in this function)
  include/linux/kmsg_dump.h:61: error: 'EINVAL' undeclared (first use in this function)

Looks like commit 595dd3d8bf ("kmsg_dump: fix build for
CONFIG_PRINTK=n") uses EINVAL without having the needed header file(s),
but I'm sure that I build tested that patch also.  oh well.

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-15 20:03:59 -07:00
KOSAKI Motohiro
a433658c30 vmscan,memcg: memcg aware swap token
Currently, memcg reclaim can disable swap token even if the swap token mm
doesn't belong in its memory cgroup.  It's slightly risky.  If an admin
creates very small mem-cgroup and silly guy runs contentious heavy memory
pressure workload, every tasks are going to lose swap token and then
system may become unresponsive.  That's bad.

This patch adds 'memcg' parameter into disable_swap_token().  and if the
parameter doesn't match swap token, VM doesn't disable it.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Reviewed-by: Rik van Riel<riel@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-15 20:03:59 -07:00
Michael Hennerich
a59ec1e7ff backlight: new driver for the ADP8870 backlight devices
Signed-off-by: Michael Hennerich <michael.hennerich@analog.com>
Signed-off-by: Mike Frysinger <vapier@gentoo.org>
Cc: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-15 20:03:59 -07:00
Grant Likely
2bc7c85210 Merge branch 'gpio/next-tegra' into gpio/next
Conflicts:
	drivers/gpio/Kconfig
	drivers/gpio/Makefile
2011-06-15 14:57:39 -06:00
Benny Halevy
c9c30dd5f7 NFSv4.1: deprecate headerpadsz in CREATE_SESSION
We don't support header padding yet so better off ditching it

Reported-by: Sid Moore <learnmost@gmail.com>
Signed-off-by: Benny Halevy <benny@tonian.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-06-15 11:24:28 -04:00
Trond Myklebust
0b760113a3 NLM: Don't hang forever on NLM unlock requests
If the NLM daemon is killed on the NFS server, we can currently end up
hanging forever on an 'unlock' request, instead of aborting. Basically,
if the rpcbind request fails, or the server keeps returning garbage, we
really want to quit instead of retrying.

Tested-by: Vasily Averin <vvs@sw.ru>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Cc: stable@kernel.org
2011-06-15 11:24:27 -04:00
Matt Carlson
221c56373e tg3: Migrate phy preprocessor defs to system defs
This patch changes to code to use some of the preprocessor
definitions from mii.h over its homegrown equivalents.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Michael Chan <mchan@broadcom.com>
Reviewed-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@conan.davemloft.net>
2011-06-15 11:11:57 -04:00
Shaohua Li
09223371de rcu: Use softirq to address performance regression
Commit a26ac2455ffcf3(rcu: move TREE_RCU from softirq to kthread)
introduced performance regression. In an AIM7 test, this commit degraded
performance by about 40%.

The commit runs rcu callbacks in a kthread instead of softirq. We observed
high rate of context switch which is caused by this. Out test system has
64 CPUs and HZ is 1000, so we saw more than 64k context switch per second
which is caused by RCU's per-CPU kthread.  A trace showed that most of
the time the RCU per-CPU kthread doesn't actually handle any callbacks,
but instead just does a very small amount of work handling grace periods.
This means that RCU's per-CPU kthreads are making the scheduler do quite
a bit of work in order to allow a very small amount of RCU-related
processing to be done.

Alex Shi's analysis determined that this slowdown is due to lock
contention within the scheduler.  Unfortunately, as Peter Zijlstra points
out, the scheduler's real-time semantics require global action, which
means that this contention is inherent in real-time scheduling.  (Yes,
perhaps someone will come up with a workaround -- otherwise, -rt is not
going to do well on large SMP systems -- but this patch will work around
this issue in the meantime.  And "the meantime" might well be forever.)

This patch therefore re-introduces softirq processing to RCU, but only
for core RCU work.  RCU callbacks are still executed in kthread context,
so that only a small amount of RCU work runs in softirq context in the
common case.  This should minimize ksoftirqd execution, allowing us to
skip boosting of ksoftirqd for CONFIG_RCU_BOOST=y kernels.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Tested-by: "Alex,Shi" <alex.shi@intel.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2011-06-14 15:25:39 -07:00
Laura Abbott
74315cccd2 iommu-api: Add missing header file
If CONFIG_IOMMU_API is not defined some functions will just
return -ENODEV. Add errno.h for the definition of ENODEV.

Signed-off-by: Laura Abbott <lauraa@codeaurora.org>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-14 11:30:22 +02:00
Linus Torvalds
40779859de Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
  SLAB: Record actual last user of freed objects.
  slub: always align cpu_slab to honor cmpxchg_double requirement
2011-06-13 13:00:53 -07:00
Jan Kara
de1b794130 jbd2: Fix oops in jbd2_journal_remove_journal_head()
jbd2_journal_remove_journal_head() can oops when trying to access
journal_head returned by bh2jh(). This is caused for example by the
following race:

	TASK1					TASK2
  jbd2_journal_commit_transaction()
    ...
    processing t_forget list
      __jbd2_journal_refile_buffer(jh);
      if (!jh->b_transaction) {
        jbd_unlock_bh_state(bh);
					jbd2_journal_try_to_free_buffers()
					  jbd2_journal_grab_journal_head(bh)
					  jbd_lock_bh_state(bh)
					  __journal_try_to_free_buffer()
					  jbd2_journal_put_journal_head(jh)
        jbd2_journal_remove_journal_head(bh);

jbd2_journal_put_journal_head() in TASK2 sees that b_jcount == 0 and
buffer is not part of any transaction and thus frees journal_head
before TASK1 gets to doing so. Note that even buffer_head can be
released by try_to_free_buffers() after
jbd2_journal_put_journal_head() which adds even larger opportunity for
oops (but I didn't see this happen in reality).

Fix the problem by making transactions hold their own journal_head
reference (in b_jcount). That way we don't have to remove journal_head
explicitely via jbd2_journal_remove_journal_head() and instead just
remove journal_head when b_jcount drops to zero. The result of this is
that [__]jbd2_journal_refile_buffer(),
[__]jbd2_journal_unfile_buffer(), and
__jdb2_journal_remove_checkpoint() can free journal_head which needs
modification of a few callers. Also we have to be careful because once
journal_head is removed, buffer_head might be freed as well. So we
have to get our own buffer_head reference where it matters.

Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2011-06-13 15:38:22 -04:00
Joe Perches
08e8138ade block: Add __attribute__((format(printf...) and fix fallout
Use the compiler to verify format strings and arguments.

Fix fallout.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-06-13 10:42:49 +02:00
Al Viro
a685e08987 Delay struct net freeing while there's a sysfs instance refering to it
* new refcount in struct net, controlling actual freeing of the memory
	* new method in kobj_ns_type_operations (->drop_ns())
	* ->current_ns() semantics change - it's supposed to be followed by
corresponding ->drop_ns().  For struct net in case of CONFIG_NET_NS it bumps
the new refcount; net_drop_ns() decrements it and calls net_free() if the
last reference has been dropped.  Method renamed to ->grab_current_ns().
	* old net_free() callers call net_drop_ns() instead.
	* sysfs_exit_ns() is gone, along with a large part of callchain
leading to it; now that the references stored in ->ns[...] stay valid we
do not need to hunt them down and replace them with NULL.  That fixes
problems in sysfs_lookup() and sysfs_readdir(), along with getting rid
of sb->s_instances abuse.

	Note that struct net *shutdown* logics has not changed - net_cleanup()
is called exactly when it used to be called.  The only thing postponed by
having a sysfs instance refering to that struct net is actual freeing of
memory occupied by struct net.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-12 17:45:41 -04:00
Linus Torvalds
08d63aac43 Merge branch 'gpio/merge' of git://git.secretlab.ca/git/linux-2.6
* 'gpio/merge' of git://git.secretlab.ca/git/linux-2.6:
  gpio/basic_mmio: add missing include of spinlock_types.h
  gpio/nomadik: fix sleepmode for elder Nomadik
2011-06-12 11:03:29 -07:00
Linus Torvalds
152b92db7a Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (55 commits)
  ISDN, hfcsusb: Don't leak in hfcsusb_ph_info()
  netpoll: call dev_put() on error in netpoll_setup()
  net: ep93xx_eth: fix DMA API violations
  net: ep93xx_eth: drop GFP_DMA from call to dma_alloc_coherent()
  net: ep93xx_eth: allocate buffers using kmalloc()
  net: ep93xx_eth: pass struct device to DMA API functions
  ep93xx: set DMA masks for the ep93xx_eth
  vlan: Fix the ingress VLAN_FLAG_REORDER_HDR check
  dl2k: EEPROM CRC calculation wrong endianess on bigendian machine
  NET: am79c961: fix assembler warnings
  NET: am79c961: ensure multicast filter is correctly set at open
  NET: am79c961: ensure asm() statements are marked volatile
  ethtool.h: fix typos
  ep93xx_eth: Update MAINTAINERS
  ipv4: Fix packet size calculation for raw IPsec packets in __ip_append_data
  netpoll: prevent netpoll setup on slave devices
  net: pmtu_expires fixes
  gianfar:localized filer table
  iwlegacy: fix channel switch locking
  mac80211: fix IBSS teardown race
  ...
2011-06-12 11:03:08 -07:00
Jiri Pirko
0b5c9db1b1 vlan: Fix the ingress VLAN_FLAG_REORDER_HDR check
Testing of VLAN_FLAG_REORDER_HDR does not belong in vlan_untag
but rather in vlan_do_receive.  Otherwise the vlan header
will not be properly put on the packet in the case of
vlan header accelleration.

As we remove the check from vlan_check_reorder_header
rename it vlan_reorder_header to keep the naming clean.

Fix up the skb->pkt_type early so we don't look at the packet
after adding the vlan tag, which guarantees we don't goof
and look at the wrong field.

Use a simple if statement instead of a complicated switch
statement to decided that we need to increment rx_stats
for a multicast packet.

Hopefully at somepoint we will just declare the case where
VLAN_FLAG_REORDER_HDR is cleared as unsupported and remove
the code.  Until then this keeps it working correctly.

Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Acked-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-11 16:15:50 -07:00
Jason Wang
10a8d94a95 virtio_net: introduce VIRTIO_NET_HDR_F_DATA_VALID
There's no need for the guest to validate the checksum if it have been
validated by host nics. So this patch introduces a new flag -
VIRTIO_NET_HDR_F_DATA_VALID which is used to bypass the checksum
examing in guest. The backend (tap/macvtap) may set this flag when
met skbs with CHECKSUM_UNNECESSARY to save cpu utilization.

No feature negotiation is needed as old driver just ignore this flag.

Iperf shows 12%-30% performance improvement for UDP traffic. For TCP,
when gro is on no difference as it produces skb with partial
checksum. But when gro is disabled, 20% or even higher improvement
could be measured by netperf.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-11 15:57:47 -07:00
David Howells
56a210526a linux/seqlock.h should #include asm/processor.h for cpu_relax()
It uses cpu_relax(), and so needs <asm/processor.h>

Without this patch, I see:

   CC      arch/mn10300/kernel/asm-offsets.s
  In file included from include/linux/time.h:8,
                   from include/linux/timex.h:56,
                   from include/linux/sched.h:57,
                   from arch/mn10300/kernel/asm-offsets.c:7:
  include/linux/seqlock.h: In function 'read_seqbegin':
  include/linux/seqlock.h:91: error: implicit declaration of function 'cpu_relax'

whilst building asb2364_defconfig on MN10300.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-11 13:17:28 -07:00
Arend van Spriel
e3ae0cac00 drivers: bcma: export bcma_core_disable() function
In the brcm80211 driver we disable the 80211 core when the driver is
'down'. The bcma_core_disable() function exactly does the same as
our implementation so exporting this function makes sense.

Cc: linux-wireless@vger.kernel.org
Cc: Rafal Milecki <zajec5@gmail.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-10 14:57:53 -04:00
Jamie Iles
e5ea3f12d4 gpio/basic_mmio: add missing include of spinlock_types.h
include/linux/basic_mmio_gpio.h uses a spinlock_t without including any
of the spinlock headers resulting in this compiler warning.

include/linux/basic_mmio_gpio.h:51:2: error: expected specifier-qualifier-list before 'spinlock_t'

Explicitly include linux/spinlock_types.h to fix it.

Signed-off-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-06-10 08:46:26 -06:00
Greg Rose
c7ac8679be rtnetlink: Compute and store minimum ifinfo dump size
The message size allocated for rtnl ifinfo dumps was limited to
a single page.  This is not enough for additional interface info
available with devices that support SR-IOV and caused a bug in
which VF info would not be displayed if more than approximately
40 VFs were created per interface.

Implement a new function pointer for the rtnl_register service that will
calculate the amount of data required for the ifinfo dump and allocate
enough data to satisfy the request.

Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2011-06-09 20:38:07 -07:00
Yegor Yefremov
b4c8cc88c1 ethtool.h: fix typos
Signed-off-by: Yegor Yefremov <yegorslists@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-09 15:05:48 -07:00
Linus Torvalds
361932bf84 Merge branch 'stable/xen-swiotlb.bugfix' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb-2.6
* 'stable/xen-swiotlb.bugfix' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb-2.6:
  swiotlb: Export swioltb_nr_tbl and utilize it as appropiate.
2011-06-09 12:52:44 -07:00
Jerry Chu
9ad7c049f0 tcp: RFC2988bis + taking RTT sample from 3WHS for the passive open side
This patch lowers the default initRTO from 3secs to 1sec per
RFC2988bis. It falls back to 3secs if the SYN or SYN-ACK packet
has been retransmitted, AND the TCP timestamp option is not on.

It also adds support to take RTT sample during 3WHS on the passive
open side, just like its active open counterpart, and uses it, if
valid, to seed the initRTO for the data transmission phase.

The patch also resets ssthresh to its initial default at the
beginning of the data transmission phase, and reduces cwnd to 1 if
there has been MORE THAN ONE retransmission during 3WHS per RFC5681.

Signed-off-by: H.K. Jerry Chu <hkchu@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-08 17:05:30 -07:00
Alexander Duyck
bff55273f9 v2 ethtool: remove support for ETHTOOL_GRXNTUPLE
This change is meant to remove all support for displaying an ntuple as
strings via ETHTOOL_GRXNTUPLE.  The reason for this change is due to the
fact that multiple issues have been found including:
 - Multiple buffer overruns for strings being displayed.
 - Incorrect filters displayed, cleared filters with ring of -2 are displayed
 - Setting get_rx_ntuple displays no rules if defined.
 - Endianess wrong on displayed values.
 - Hard limit of 1024 filters makes display functionality extremely limited

The only driver that had supported this interface was ixgbe.  Since it no
longer uses the interface and due to the issues mentioned above I am
submitting this patch to remove it.

v2:
Updated based on comments from Ben Hutchings
 - Left ETH_SS_NTUPLE_FILTERS in code but commented on it being deprecated
 - Removed ethtool_rx_ntuple_list and ethtool_rx_ntuple_flow_spec_container
 - Left ETHTOOL_GRXNTUPLE but commented it as deprecated

Also cleaned up set_rx_ntuple since there is no flow spec container to
maintain we can drop all the code for the alloc and free of it and just
return ops->set_rx_ntuple().
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-08 16:45:31 -07:00
Linus Torvalds
13e12d14e2 vfs: reorganize 'struct inode' layout a bit
This tries to make the 'struct inode' accesses denser in the data cache
by moving a commonly accessed field (i_security) closer to other fields
that are accessed often.

It also makes 'i_state' just an 'unsigned int' rather than 'unsigned
long', since we only use a few bits of that field, and moves it next to
the existing 'i_flags' so that we potentially get better structure
layout (although depending on config options, i_flags may already have
packed in the same word as i_lock, so this improves packing only for the
case of spinlock debugging)

Out 'struct inode' is still way too big, and we should probably move
some other fields around too (the acl fields in particular) for better
data cache access density.  Other fields (like the inode hash) are
likely to be entirely irrelevant under most loads.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-08 15:18:19 -07:00
Greg Kroah-Hartman
7e24cf43f7 Merge 3.0-rc2 + Linus's latest into usb-linus
This is needed to get the following MAINTAINERS patch to apply properly.

Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-06-08 13:50:35 -07:00
John W. Linville
c0c33addcb Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2011-06-08 13:44:21 -04:00
Linus Torvalds
33726bf214 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf: Fix comments in include/linux/perf_event.h
  perf: Comment /proc/sys/kernel/perf_event_paranoid to be part of user ABI
  perf python: Fix argument name list of read_on_cpu()
  perf evlist: Don't die if sample_{id_all|type} is invalid
  perf python: Use exception to propagate errors
  perf evlist: Remove dependency on debug routines
  perf, cgroups: Fix up for new API
2011-06-08 08:36:15 -07:00
Linus Torvalds
cb0a02ecf9 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  genirq: Ensure we locate the passed IRQ in irq_alloc_descs()
  genirq: Fix descriptor init on non-sparse IRQs
  irq: Handle spurios irq detection for threaded irqs
  genirq: Print threaded handler in spurious debug output
2011-06-07 19:21:11 -07:00
Linus Torvalds
6715a52a58 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Fix/clarify set_task_cpu() locking rules
  lockdep: Fix lock_is_held() on recursion
  sched: Fix schedstat.nr_wakeups_migrate
  sched: Fix cross-cpu clock sync on remote wakeups
2011-06-07 19:20:28 -07:00
Linus Torvalds
8397345172 Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  vfs: make unlink() and rmdir() return ENOENT in preference to EROFS
  lmLogOpen() broken failure exit
  usb: remove bad dput after dentry_unhash
  more conservative S_NOSEC handling
2011-06-07 18:36:59 -07:00
Benjamin Herrenschmidt
ef3b4f8cc2 pci/of: Consolidate pci_bus_to_OF_node()
The generic code always get the device-node in the right place now
so a single implementation will work for all archs

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Michal Simek <monstr@monstr.eu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-06-08 09:08:57 +10:00
Benjamin Herrenschmidt
64099d981c pci/of: Consolidate pci_device_to_OF_node()
All archs do more or less the same thing now, move it into
a single generic place.

I chose pci.h rather than of_pci.h to avoid having to change
all call-sites to include the later.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Michal Simek <monstr@monstr.eu>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-06-08 09:08:43 +10:00
Benjamin Herrenschmidt
98d9f30c82 pci/of: Match PCI devices to OF nodes dynamically
powerpc has two different ways of matching PCI devices to their
corresponding OF node (if any) for historical reasons. The ppc64 one
does a scan looking for matching bus/dev/fn, while the ppc32 one does a
scan looking only for matching dev/fn on each level in order to be
agnostic to busses being renumbered (which Linux does on some
platforms).

This removes both and instead moves the matching code to the PCI core
itself. It's the most logical place to do it: when a pci_dev is created,
we know the parent and thus can do a single level scan for the matching
device_node (if any).

The benefit is that all archs now get the matching for free. There's one
hook the arch might want to provide to match a PHB bus to its device
node. A default weak implementation is provided that looks for the
parent device device node, but it's not entirely reliable on powerpc for
various reasons so powerpc provides its own.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Michal Simek <monstr@monstr.eu>
Acked-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2011-06-08 09:08:17 +10:00
K. Y. Srinivasan
ea2c00095c Connector: Set the CN_NETLINK_USERS correctly
The CN_NETLINK_USERS must be set to the highest valid index +1.
Thanks to Evgeniy for pointing this out.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
Cc: stable <stable@kernel.org> [.39]
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-06-07 12:02:00 -07:00
John W. Linville
41bfce8ede Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 2011-06-07 14:07:11 -04:00
Alan Stern
21c13a4f7b usb-storage: redo incorrect reads
Some USB mass-storage devices have bugs that cause them not to handle
the first READ(10) command they receive correctly.  The Corsair
Padlock v2 returns completely bogus data for its first read (possibly
it returns the data in encrypted form even though the device is
supposed to be unlocked).  The Feiya SD/SDHC card reader fails to
complete the first READ(10) command after it is plugged in or after a
new card is inserted, returning a status code that indicates it thinks
the command was invalid, which prevents the kernel from retrying the
read.

Since the first read of a new device or a new medium is for the
partition sector, the kernel is unable to retrieve the device's
partition table.  Users have to manually issue an "hdparm -z" or
"blockdev --rereadpt" command before they can access the device.

This patch (as1470) works around the problem.  It adds a new quirk
flag, US_FL_INVALID_READ10, indicating that the first READ(10) should
always be retried immediately, as should any failing READ(10) commands
(provided the preceding READ(10) command succeeded, to avoid getting
stuck in a loop).  The patch also adds appropriate unusual_devs
entries containing the new flag.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Tested-by: Sven Geggus <sven-usbst@geggus.net>
Tested-by: Paul Hartman <paul.hartman+linux@gmail.com>
CC: Matthew Dharm <mdharm-usb@one-eyed-alien.net>
CC: <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-06-07 09:05:42 -07:00
Tomoki Sekiyama
6dc1418e13 HID: yurex: recognize GeneralKeys wireless presenter as generic HID
Unfortunately, the device seems to have the same Vendor ID and Product ID
as YUREX leg-shakes sensors, and the commit 6bc235a2e2 ("USB: add driver
for Meywa-Denki & Kayac YUREX") added the ID to hid_ignore_list.

I believe that we can distinguish YUREX and the Wireless Presenter by
device type.  The patch below makes the driver ignore only YUREX
(bInterfaceProtocol==0), and recognize Wireless Presenter
(bInterfaceProtocol is keyboard or mouse) as generic HID.  (I don't have
the Wireless Presenter, so not yet ested.)

** YUREX lsusb information:
Bus 002 Device 007: ID 0c45:1010 Microdia
Device Descriptor:
   bLength                18
   bDescriptorType         1
   bcdUSB               1.10
   bDeviceClass            0 (Defined at Interface level)
   bDeviceSubClass         0
   bDeviceProtocol         0
   bMaxPacketSize0         8
   idVendor           0x0c45 Microdia
   idProduct          0x1010
   bcdDevice            0.03
   iManufacturer           1 JESS
   iProduct                2 YUREX
   iSerial                 3 10000269
   bNumConfigurations      1
   Configuration Descriptor:
     bLength                 9
     bDescriptorType         2
     wTotalLength           34
     bNumInterfaces          1
     bConfigurationValue     1
     iConfiguration          0
     bmAttributes         0xa0
       (Bus Powered)
       Remote Wakeup
     MaxPower              100mA
     Interface Descriptor:
       bLength                 9
       bDescriptorType         4
       bInterfaceNumber        0
       bAlternateSetting       0
       bNumEndpoints           1
       bInterfaceClass         3 Human Interface Device
       bInterfaceSubClass      1 Boot Interface Subclass
       bInterfaceProtocol      0 None
       iInterface              0
         HID Device Descriptor:
           bLength                 9
           bDescriptorType        33
           bcdHID               1.10
           bCountryCode            0 Not supported
           bNumDescriptors         1
           bDescriptorType        34 Report
           wDescriptorLength      31
          Report Descriptors:
            ** UNAVAILABLE **
       Endpoint Descriptor:
         bLength                 7
         bDescriptorType         5
         bEndpointAddress     0x81  EP 1 IN
         bmAttributes            3
           Transfer Type            Interrupt
           Synch Type               None
           Usage Type               Data
         wMaxPacketSize     0x0008  1x 8 bytes
         bInterval              10
Device Status:     0x0002
   (Bus Powered)
   Remote Wakeup Enabled

Addresses https://bugzilla.kernel.org/show_bug.cgi?id=26922

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@gmail.com>
Cc: Greg KH <gregkh@suse.de>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Cc: Maciej Rutecki <maciej.rutecki@gmail.com>
Reported-by: Thomas B?chler <thomas@archlinux.org>
Tested-by: Thomas B?chler <thomas@archlinux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-06-07 15:34:17 +02:00
Alexey Dobriyan
a6b7a40786 net: remove interrupt.h inclusion from netdevice.h
* remove interrupt.g inclusion from netdevice.h -- not needed
* fixup fallout, add interrupt.h and hardirq.h back where needed.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-06 22:55:11 -07:00
Eric Dumazet
13fcb7bd32 af_packet: prevent information leak
In 2.6.27, commit 393e52e33c (packet: deliver VLAN TCI to userspace)
added a small information leak.

Add padding field and make sure its zeroed before copy to user.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-06 22:42:06 -07:00
David S. Miller
3019de124b net: Rework netdev_drivername() to avoid warning.
This interface uses a temporary buffer, but for no real reason.
And now can generate warnings like:

net/sched/sch_generic.c: In function dev_watchdog
net/sched/sch_generic.c:254:10: warning: unused variable drivername

Just return driver->name directly or "".

Reported-by: Connor Hansen <cmdkhh@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-06 16:41:33 -07:00
FUJITA Tomonori
5f98ecdbce swiotlb: Export swioltb_nr_tbl and utilize it as appropiate.
By default the io_tlb_nslabs is set to zero, and gets set to
whatever value is passed in via swiotlb_init_with_tbl function.
The default value passed in is 64MB. However, if the user provides
the 'swiotlb=<nslabs>' the default value is ignored and
the value provided by the user is used... Except when the SWIOTLB
is used under Xen - there the default value of 64MB is used and
the Xen-SWIOTLB has no mechanism to get the 'io_tlb_nslabs' filled
out by setup_io_tlb_npages functions. This patch provides a function
for the Xen-SWIOTLB to call to see if the io_tlb_nslabs is set
and if so use that value.

Signed-off-by: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-06-06 15:41:16 -04:00
J. Bruce Fields
b084f598df nfsd: fix dependency of nfsd on auth_rpcgss
Commit b0b0c0a26e "nfsd: add proc file listing kernel's gss_krb5
enctypes" added an nunnecessary dependency of nfsd on the auth_rpcgss
module.

It's a little ad hoc, but since the only piece of information nfsd needs
from rpcsec_gss_krb5 is a single static string, one solution is just to
share it with an include file.

Cc: stable@kernel.org
Reported-by: Michael Guntsche <mike@it-loops.com>
Cc: Kevin Coffman <kwc@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-06-06 15:07:15 -04:00
Grant Likely
8c31b1635b Merge branch 'gpio/next-mx' into gpio/next 2011-06-06 10:10:07 -06:00
Eric Dumazet
fb04883371 netfilter: add more values to enum ip_conntrack_info
Following error is raised (and other similar ones) :

net/ipv4/netfilter/nf_nat_standalone.c: In function ‘nf_nat_fn’:
net/ipv4/netfilter/nf_nat_standalone.c:119:2: warning: case value ‘4’
not in enumerated type ‘enum ip_conntrack_info’

gcc barfs on adding two enum values and getting a not enumerated
result :

case IP_CT_RELATED+IP_CT_IS_REPLY:

Add missing enum values

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: David Miller <davem@davemloft.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2011-06-06 01:35:10 +02:00
Tejun Heo
dd1d677269 signal: remove three noop tracehooks
Remove the following three noop tracehooks in signals.c.

* tracehook_force_sigpending()
* tracehook_get_signal()
* tracehook_finish_jctl()

The code area is about to be updated and these hooks don't do anything
other than obfuscating the logic.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-04 18:17:11 +02:00
Tejun Heo
7dd3db54e7 job control: introduce task_set_jobctl_pending()
task->jobctl currently hosts JOBCTL_STOP_PENDING and will host TRAP
pending bits too.  Setting pending conditions on a dying task may make
the task unkillable.  Currently, each setting site is responsible for
checking for the condition but with to-be-added job control traps this
becomes too fragile.

This patch adds task_set_jobctl_pending() which should be used when
setting task->jobctl bits to schedule a stop or trap.  The function
performs the followings to ease setting pending bits.

* Sanity checks.

* If fatal signal is pending or PF_EXITING is set, no bit is set.

* STOP_SIGMASK is automatically cleared if new value is being set.

do_signal_stop() and ptrace_attach() are updated to use
task_set_jobctl_pending() instead of setting STOP_PENDING explicitly.
The surrounding structures around setting are changed to fit
task_set_jobctl_pending() better but there should be no userland
visible behavior difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-04 18:17:11 +02:00
Tejun Heo
3759a0d94c job control: introduce JOBCTL_PENDING_MASK and task_clear_jobctl_pending()
This patch introduces JOBCTL_PENDING_MASK and replaces
task_clear_jobctl_stop_pending() with task_clear_jobctl_pending()
which takes an extra @mask argument.

JOBCTL_PENDING_MASK is currently equal to JOBCTL_STOP_PENDING but
future patches will add more bits.  recalc_sigpending_tsk() is updated
to use JOBCTL_PENDING_MASK instead.

task_clear_jobctl_pending() takes @mask which in subset of
JOBCTL_PENDING_MASK and clears the relevant jobctl bits.  If
JOBCTL_STOP_PENDING is set, other STOP bits are cleared together.  All
task_clear_jobctl_stop_pending() users are updated to call
task_clear_jobctl_pending() with JOBCTL_STOP_PENDING which is
functionally identical to task_clear_jobctl_stop_pending().

This patch doesn't cause any functional change.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-04 18:17:10 +02:00
Tejun Heo
755e276b33 ptrace: ptrace_check_attach(): rename @kill to @ignore_state and add comments
PTRACE_INTERRUPT is going to be added which should also skip
task_is_traced() check in ptrace_check_attach().  Rename @kill to
@ignore_state and make it bool.  Add function comment while at it.

This patch doesn't introduce any behavior difference.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-04 18:17:10 +02:00
Tejun Heo
a8f072c1d6 job control: rename signal->group_stop and flags to jobctl and update them
signal->group_stop currently hosts mostly group stop related flags;
however, it's gonna be used for wider purposes and the GROUP_STOP_
flag prefix becomes confusing.  Rename signal->group_stop to
signal->jobctl and rename all GROUP_STOP_* flags to JOBCTL_*.

Bit position macros JOBCTL_*_BIT are defined and JOBCTL_* flags are
defined in terms of them to allow using bitops later.

While at it, reassign JOBCTL_TRAPPING to bit 22 to better accomodate
future additions.

This doesn't cause any functional change.

-v2: JOBCTL_*_BIT macros added as suggested by Linus.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-04 18:17:09 +02:00
Linus Torvalds
0e833d8cfc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (40 commits)
  tg3: Fix tg3_skb_error_unmap()
  net: tracepoint of net_dev_xmit sees freed skb and causes panic
  drivers/net/can/flexcan.c: add missing clk_put
  net: dm9000: Get the chip in a known good state before enabling interrupts
  drivers/net/davinci_emac.c: add missing clk_put
  af-packet: Add flag to distinguish VID 0 from no-vlan.
  caif: Fix race when conditionally taking rtnl lock
  usbnet/cdc_ncm: add missing .reset_resume hook
  vlan: fix typo in vlan_dev_hard_start_xmit()
  net/ipv4: Check for mistakenly passed in non-IPv4 address
  iwl4965: correctly validate temperature value
  bluetooth l2cap: fix locking in l2cap_global_chan_by_psm
  ath9k: fix two more bugs in tx power
  cfg80211: don't drop p2p probe responses
  Revert "net: fix section mismatches"
  drivers/net/usb/catc.c: Fix potential deadlock in catc_ctrl_run()
  sctp: stop pending timers and purge queues when peer restart asoc
  drivers/net: ks8842 Fix crash on received packet when in PIO mode.
  ip_options_compile: properly handle unaligned pointer
  iwlagn: fix incorrect PCI subsystem id for 6150 devices
  ...
2011-06-04 23:16:00 +09:00
Vince Weaver
d7ebe75b06 perf: Fix comments in include/linux/perf_event.h
Fix include/linux/perf_event.h comments to be consistent with
the actual #define names. This is trivial, but it can be a bit
confusing when first  reading through the file.

Signed-off-by: Vince Weaver <vweaver1@eecs.utk.edu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: paulus@samba.org
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.00.1106031757090.29381@cl320.eecs.utk.edu
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-06-04 12:31:14 +02:00
Linus Torvalds
4f1ba49efa Merge branch 'for-linus' of git://git.kernel.dk/linux-block
* 'for-linus' of git://git.kernel.dk/linux-block:
  block: Use hlist_entry() for io_context.cic_list.first
  cfq-iosched: Remove bogus check in queue_fail path
  xen/blkback: potential null dereference in error handling
  xen/blkback: don't call vbd_size() if bd_disk is NULL
  block: blkdev_get() should access ->bd_disk only after success
  CFQ: Fix typo and remove unnecessary semicolon
  block: remove unwanted semicolons
  Revert "block: Remove extra discard_alignment from hd_struct."
  nbd: adjust 'max_part' according to part_shift
  nbd: limit module parameters to a sane value
  nbd: pass MSG_* flags to kernel_recvmsg()
  block: improve the bio_add_page() and bio_add_pc_page() descriptions
2011-06-04 08:11:26 +09:00
Al Viro
9e1f1de02c more conservative S_NOSEC handling
Caching "we have already removed suid/caps" was overenthusiastic as merged.
On network filesystems we might have had suid/caps set on another client,
silently picked by this client on revalidate, all of that *without* clearing
the S_NOSEC flag.

AFAICS, the only reasonably sane way to deal with that is
	* new superblock flag; unless set, S_NOSEC is not going to be set.
	* local block filesystems set it in their ->mount() (more accurately,
mount_bdev() does, so does btrfs ->mount(), users of mount_bdev() other than
local block ones clear it)
	* if any network filesystem (or a cluster one) wants to use S_NOSEC,
it'll need to set MS_NOSEC in sb->s_flags *AND* take care to clear S_NOSEC when
inode attribute changes are picked from other clients.

It's not an earth-shattering hole (anybody that can set suid on another client
will almost certainly be able to write to the file before doing that anyway),
but it's a bug that needs fixing.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-06-03 18:24:58 -04:00
Linus Torvalds
55db4c64ed Revert "tty: make receive_buf() return the amout of bytes received"
This reverts commit b1c43f82c5.

It was broken in so many ways, and results in random odd pty issues.

It re-introduced the buggy schedule_work() in flush_to_ldisc() that can
cause endless work-loops (see commit a5660b41af: "tty: fix endless
work loop when the buffer fills up").

It also used an "unsigned int" return value fo the ->receive_buf()
function, but then made multiple functions return a negative error code,
and didn't actually check for the error in the caller.

And it didn't actually work at all.  BenH bisected down odd tty behavior
to it:
  "It looks like the patch is causing some major malfunctions of the X
   server for me, possibly related to PTYs.  For example, cat'ing a
   large file in a gnome terminal hangs the kernel for -minutes- in a
   loop of what looks like flush_to_ldisc/workqueue code, (some ftrace
   data in the quoted bits further down).

   ...

   Some more data: It -looks- like what happens is that the
   flush_to_ldisc work queue entry constantly re-queues itself (because
   the PTY is full ?) and the workqueue thread will basically loop
   forver calling it without ever scheduling, thus starving the consumer
   process that could have emptied the PTY."

which is pretty much exactly the problem we fixed in a5660b41af.

Milton Miller pointed out the 'unsigned int' issue.

Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reported-by: Milton Miller <miltonm@bga.com>
Cc: Stefan Bigler <stefan.bigler@keymile.com>
Cc: Toby Gray <toby.gray@realvnc.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-04 06:33:24 +09:00
Rafał Miłecki
27f18dc2da bcma: read SPROM and extract MAC from it
In case of BCMA cards SPROM is located in the ChipCommon core, it is
not mapped as separated host window. So far we have met only SPROMs rev
8.
SPROM layout seems to be the same as for SSB buses, so we decided to
share SPROM struct and some defines.
For now we extract MAC address only, this can be improved of course.

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-03 15:01:07 -04:00
Arend van Spriel
10f8113ecb lib: cordic: add library module providing cordic angle calculation
The brcm80211 driver in the staging tree has a cordic function to
determine cosine and sine for a given angle. Feedback received from
John Linville suggested that these kind of functions should be made
available to others as a library function in the kernel tree. The
b43 driver also has a cordic angle calculation implemented.

Cc: linux-kernel@vger.kernel.org
Cc: linux-wireless@vger.kernel.org
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: Greg Kroah-Hartman <gregkh@suse.de>
Cc: Dan Carpenter <error27@gmail.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Cc: Larry Finger <Larry.Finger@lwfinger.net>
Reviewed-by: Roland Vossen <rvossen@broadcom.com>
Reviewed-by: Henry Ptasinski <henryp@broadcom.com>
Reviewed-by: Franky (Zhenhui) Lin <frankyl@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-03 15:01:07 -04:00
Arend van Spriel
7150962d63 lib: crc8: add new library module providing crc8 algorithm
The brcm80211 driver in staging tree uses a crc8 function. Based on
feedback from John Linville to move this to lib directory, the linux
source has been searched. Although there is currently only one other
kernel driver using this algorithm (ie. drivers/ssb) we are providing
this as a library function for others to use.

Cc: linux-kernel@vger.kernel.org
Cc: linux-wireless@vger.kernel.org
Cc: Dan Carpenter <error27@gmail.com>
Cc: George Spelvin <linux@horizon.com>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Reviewed-by: Henry Ptasinski <henryp@broadcom.com>
Reviewed-by: Roland Vossen <rvossen@broadcom.com>
Reviewed-by: "Franky (Zhenhui) Lin" <frankyl@broadcom.com>
Signed-off-by: Arend van Spriel <arend@broadcom.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-03 15:01:06 -04:00
John W. Linville
7b29dc21ea Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6 into for-davem 2011-06-03 14:31:50 -04:00
H Hartley Sweeten
a3cc68c378 gpio/74x164: remove unnecessary defines and prototype
Remove the #define GEN_74X164_GPIO_COUNT since it's only used in
one place and it's meaning is obvious.  Also remove the #define
GEN_74X164_DRIVER_NAME and use spi->modalias to set the gpio chip's
label and the string "74x164" for the driver name.

Reorder the code slightly to remove the need to prototype
gen_74x164_set_value.

Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-06-03 12:14:16 -06:00
Chris Metcalf
d4d84fef6d slub: always align cpu_slab to honor cmpxchg_double requirement
On an architecture without CMPXCHG_LOCAL but with DEBUG_VM enabled,
the VM_BUG_ON() in __pcpu_double_call_return_bool() will cause an early
panic during boot unless we always align cpu_slab properly.

In principle we could remove the alignment-testing VM_BUG_ON() for
architectures that don't have CMPXCHG_LOCAL, but leaving it in means
that new code will tend not to break x86 even if it is introduced
on another platform, and it's low cost to require alignment.

Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-06-03 19:33:49 +03:00
Sebastian Andrzej Siewior
3a43e05f4d irq: Handle spurios irq detection for threaded irqs
The detection of spurios interrupts is currently limited to first level
handler. In force-threaded mode we never notice if the threaded irq does
not feel responsible.
This patch catches the return value of the threaded handler and forwards
it to the spurious detector. If the primary handler returns only
IRQ_WAKE_THREAD then the spourious detector ignores it because it gets
called again from the threaded handler.

[ tglx: Report the erroneous return value early and bail out ]

Signed-off-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Link: http://lkml.kernel.org/r/1306824972-27067-2-git-send-email-sebastian@breakpoint.cc
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-06-03 14:53:15 +02:00
Ben Greear
a3bcc23e89 af-packet: Add flag to distinguish VID 0 from no-vlan.
Currently, user-space cannot determine if a 0 tcp_vlan_tci
means there is no VLAN tag or the VLAN ID was zero.

Add flag to make this explicit.  User-space can check for
TP_STATUS_VLAN_VALID || tp_vlan_tci > 0, which will be backwards
compatible. Older could would have just checked for tp_vlan_tci,
so it will work no worse than before.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-01 21:18:03 -07:00
Linus Torvalds
f0f52a9463 Merge git://git.infradead.org/iommu-2.6
* git://git.infradead.org/iommu-2.6:
  intel-iommu: Fix off-by-one in RMRR setup
  intel-iommu: Add domain check in domain_remove_one_dev_info
  intel-iommu: Remove Host Bridge devices from identity mapping
  intel-iommu: Use coherent DMA mask when requested
  intel-iommu: Dont cache iova above 32bit
  intel-iommu: Speed up processing of the identity_mapping function
  intel-iommu: Check for identity mapping candidate using system dma mask
  intel-iommu: Only unlink device domains from iommu
  intel-iommu: Enable super page (2MiB, 1GiB, etc.) support
  intel-iommu: Flush unmaps at domain_exit
  intel-iommu: Remove obsolete comment from detect_intel_iommu
  intel-iommu: fix VT-d PMR disable for TXT on S3 resume
2011-06-02 05:48:50 +09:00
Rafał Miłecki
9d75ef0f8f bcma: host pci: implement block R/W operations
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-01 15:12:28 -04:00
Rafał Miłecki
1bdcd095e3 bcma: add IRQ number and pointer to DMA dev
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-01 15:10:58 -04:00
Eliad Peller
333ba73252 cfg80211: don't drop p2p probe responses
Commit 0a35d36 ("cfg80211: Use capability info to detect mesh beacons")
assumed that probe response with both ESS and IBSS bits cleared
means that the frame was sent by a mesh sta.

However, these capabilities are also being used in the p2p_find phase,
and the mesh-validation broke it.

Rename the WLAN_CAPABILITY_IS_MBSS macro, and verify that mesh ies
exist before assuming this frame was sent by a mesh sta.

Signed-off-by: Eliad Peller <eliad@wizery.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-01 14:34:01 -04:00
Linus Torvalds
3f303103b8 Merge git://git.infradead.org/mtd-2.6
* git://git.infradead.org/mtd-2.6:
  mtd: fix physmap.h warnings
2011-06-01 21:47:39 +09:00
Youquan Song
6dd9a7c737 intel-iommu: Enable super page (2MiB, 1GiB, etc.) support
There are no externally-visible changes with this. In the loop in the
internal __domain_mapping() function, we simply detect if we are mapping:
  - size >= 2MiB, and
  - virtual address aligned to 2MiB, and
  - physical address aligned to 2MiB, and
  - on hardware that supports superpages.

(and likewise for larger superpages).

We automatically use a superpage for such mappings. We never have to
worry about *breaking* superpages, since we trust that we will always
*unmap* the same range that was mapped. So all we need to do is ensure
that dma_pte_clear_range() will also cope with superpages.

Adjust pfn_to_dma_pte() to take a superpage 'level' as an argument, so
it can return a PTE at the appropriate level rather than always
extending the page tables all the way down to level 1. Again, this is
simplified by the fact that we should never encounter existing small
pages when we're creating a mapping; any old mapping that used the same
virtual range will have been entirely removed and its obsolete page
tables freed.

Provide an 'intel_iommu=sp_off' argument on the command line as a
chicken bit. Not that it should ever be required.

==

The original commit seen in the iommu-2.6.git was Youquan's
implementation (and completion) of my own half-baked code which I'd
typed into an email. Followed by half a dozen subsequent 'fixes'.

I've taken the unusual step of rewriting history and collapsing the
original commits in order to keep the main history simpler, and make
life easier for the people who are going to have to backport this to
older kernels. And also so I can give it a more coherent commit comment
which (hopefully) gives a better explanation of what's going on.

The original sequence of commits leading to identical code was:

Youquan Song (3):
      intel-iommu: super page support
      intel-iommu: Fix superpage alignment calculation error
      intel-iommu: Fix superpage level calculation error in dma_pfn_level_pte()

David Woodhouse (4):
      intel-iommu: Precalculate superpage support for dmar_domain
      intel-iommu: Fix hardware_largepage_caps()
      intel-iommu: Fix inappropriate use of superpages in __domain_mapping()
      intel-iommu: Fix phys_pfn in __domain_mapping for sglist pages

Signed-off-by: Youquan Song <youquan.song@intel.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-06-01 12:26:35 +01:00
Randy Dunlap
63da029015 mtd: fix physmap.h warnings
Fix build warnings in physmap.h:

include/linux/mtd/physmap.h:25: warning: 'struct platform_device' declared inside parameter list
include/linux/mtd/physmap.h:25: warning: its scope is only this definition or declaration, which is probably not what you want
include/linux/mtd/physmap.h:26: warning: 'struct platform_device' declared inside parameter list
include/linux/mtd/physmap.h:27: warning: 'struct platform_device' declared inside parameter list

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
2011-06-01 11:36:49 +01:00
Peter Zijlstra
f339b9dc1f sched: Fix schedstat.nr_wakeups_migrate
While looking over the code I found that with the ttwu rework the
nr_wakeups_migrate test broke since we now switch cpus prior to
calling ttwu_stat(), hence the test is always true.

Cure this by passing the migration state in wake_flags. Also move the
whole test under CONFIG_SMP, its hard to migrate tasks on UP :-)

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-pwwxl7gdqs5676f1d4cx6pj7@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-31 14:19:57 +02:00
Namhyung Kim
ea9d6553b3 block: remove unwanted semicolons
Since those defined functions require additional semicolon
from the caller, they could cause potential syntax errors
when used in if-else statements.

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Acked-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-05-31 13:45:53 +02:00
Jens Axboe
a1706ac4c0 Revert "block: Remove extra discard_alignment from hd_struct."
It was not a good idea to start dereferencing disk->queue from
the fs sysfs strategy for displaying discard alignment. We ran
into first a NULL pointer deref, and after fixing that we sometimes
see unvalid disk->queue pointer values.

Since discard is the only one of the bunch actually looking into
the queue, just revert the change.

This reverts commit 23ceb5b771.

Conflicts:
	fs/partitions/check.c
2011-05-30 07:42:51 +02:00
Michael S. Tsirkin
7ab358c23c virtio: add api for delayed callbacks
Add an API that tells the other side that callbacks
should be delayed until a lot of work has been done.
Implement using the new event_idx feature.

Note: it might seem advantageous to let the drivers
ask for a callback after a specific capacity has
been reached. However, as a single head can
free many entries in the descriptor table,
we don't really have a clue about capacity
until get_buf is called. The API is the simplest
to implement at the moment, we'll see what kind of
hints drivers can pass when there's more than one
user of the feature.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-30 11:14:16 +09:30
Michael S. Tsirkin
bf7035bf20 virtio ring: inline function to check for events
With the new used_event and avail_event and features, both
host and guest need similar logic to check whether events are
enabled, so it helps to put the common code in the header.

Note that Xen has similar logic for notification hold-off
in include/xen/interface/io/ring.h with req_event and req_prod
corresponding to event_idx + 1 and new_idx respectively.
+1 comes from the fact that req_event and req_prod in Xen start at 1,
while event index in virtio starts at 0.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-30 11:14:14 +09:30
Michael S. Tsirkin
770b31a85e virtio: event index interface
Define a new feature bit for the guest and host to utilize
an event index (like Xen) instead if a flag bit to enable/disable
interrupts and kicks.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2011-05-30 11:14:14 +09:30
Rusty Russell
a1b383870a virtio: add full three-clause BSD text to headers.
It's unclear to me if it's important, but it's obviously causing my
technical colleages some headaches and I'd hate such imprecision to
slow virtio adoption.

I've emailed this to all non-trivial contributors for approval, too.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: Ryan Harper <ryanh@us.ibm.com>
Acked-by: Anthony Liguori <aliguori@us.ibm.com>
Acked-by: Eric Van Hensbergen <ericvh@gmail.com>
Acked-by: john cooper <john.cooper@redhat.com>
Acked-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Fernando Luis Vazquez Cao <fernando@oss.ntt.co.jp>
2011-05-30 11:14:14 +09:30
Linus Torvalds
cd1acdf172 Merge branch 'pnfs-submit' of git://git.open-osd.org/linux-open-osd
* 'pnfs-submit' of git://git.open-osd.org/linux-open-osd: (32 commits)
  pnfs-obj: pg_test check for max_io_size
  NFSv4.1: define nfs_generic_pg_test
  NFSv4.1: use pnfs_generic_pg_test directly by layout driver
  NFSv4.1: change pg_test return type to bool
  NFSv4.1: unify pnfs_pageio_init functions
  pnfs-obj: objlayout_encode_layoutcommit implementation
  pnfs: encode_layoutcommit
  pnfs-obj: report errors and .encode_layoutreturn Implementation.
  pnfs: encode_layoutreturn
  pnfs: layoutret_on_setattr
  pnfs: layoutreturn
  pnfs-obj: osd raid engine read/write implementation
  pnfs: support for non-rpc layout drivers
  pnfs-obj: define per-inode private structure
  pnfs: alloc and free layout_hdr layoutdriver methods
  pnfs-obj: objio_osd device information retrieval and caching
  pnfs-obj: decode layout, alloc/free lseg
  pnfs-obj: pnfs_osd XDR client implementation
  pnfs-obj: pnfs_osd XDR definitions
  pnfs-obj: objlayoutdriver module skeleton
  ...
2011-05-29 14:10:13 -07:00
Linus Torvalds
6345d24daf mm: Fix boot crash in mm_alloc()
Thomas Gleixner reports that we now have a boot crash triggered by
CONFIG_CPUMASK_OFFSTACK=y:

    BUG: unable to handle kernel NULL pointer dereference at   (null)
    IP: [<c11ae035>] find_next_bit+0x55/0xb0
    Call Trace:
     [<c11addda>] cpumask_any_but+0x2a/0x70
     [<c102396b>] flush_tlb_mm+0x2b/0x80
     [<c1022705>] pud_populate+0x35/0x50
     [<c10227ba>] pgd_alloc+0x9a/0xf0
     [<c103a3fc>] mm_init+0xec/0x120
     [<c103a7a3>] mm_alloc+0x53/0xd0

which was introduced by commit de03c72cfc ("mm: convert
mm->cpu_vm_cpumask into cpumask_var_t"), and is due to wrong ordering of
mm_init() vs mm_init_cpumask

Thomas wrote a patch to just fix the ordering of initialization, but I
hate the new double allocation in the fork path, so I ended up instead
doing some more radical surgery to clean it all up.

Reported-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: Ingo Molnar <mingo@elte.hu>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-29 11:32:28 -07:00
Linus Torvalds
cab0d85c8d Merge branch 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6
* 'for-linus' of git://git390.marist.edu/pub/scm/linux-2.6:
  [S390] mm: fix mmu_gather rework
  [S390] mm: fix storage key handling
2011-05-29 11:30:20 -07:00
Linus Torvalds
a74d70b63f Merge branch 'for-2.6.40' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.40' of git://linux-nfs.org/~bfields/linux: (22 commits)
  nfsd: make local functions static
  NFSD: Remove unused variable from nfsd4_decode_bind_conn_to_session()
  NFSD: Check status from nfsd4_map_bcts_dir()
  NFSD: Remove setting unused variable in nfsd_vfs_read()
  nfsd41: error out on repeated RECLAIM_COMPLETE
  nfsd41: compare request's opcnt with session's maxops at nfsd4_sequence
  nfsd v4.1 lOCKT clientid field must be ignored
  nfsd41: add flag checking for create_session
  nfsd41: make sure nfs server process OPEN with EXCLUSIVE4_1 correctly
  nfsd4: fix wrongsec handling for PUTFH + op cases
  nfsd4: make fh_verify responsibility of nfsd_lookup_dentry caller
  nfsd4: introduce OPDESC helper
  nfsd4: allow fh_verify caller to skip pseudoflavor checks
  nfsd: distinguish functions of NFSD_MAY_* flags
  svcrpc: complete svsk processing on cb receive failure
  svcrpc: take advantage of tcp autotuning
  SUNRPC: Don't wait for full record to receive tcp data
  svcrpc: copy cb reply instead of pages
  svcrpc: close connection if client sends short packet
  svcrpc: note network-order types in svc_process_calldir
  ...
2011-05-29 11:21:12 -07:00
Linus Torvalds
b11b06d90a Merge git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm
* git://git.kernel.org/pub/scm/linux/kernel/git/agk/linux-2.6-dm:
  dm kcopyd: return client directly and not through a pointer
  dm kcopyd: reserve fewer pages
  dm io: use fixed initial mempool size
  dm kcopyd: alloc pages from the main page allocator
  dm kcopyd: add gfp parm to alloc_pl
  dm kcopyd: remove superfluous page allocation spinlock
  dm kcopyd: preallocate sub jobs to avoid deadlock
  dm kcopyd: avoid pointless job splitting
  dm mpath: do not fail paths after integrity errors
  dm table: reject devices without request fns
  dm table: allow targets to support discards internally
2011-05-29 11:20:48 -07:00
Linus Torvalds
f1d1c9fa8f Merge branch 'nfs-for-2.6.40' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'nfs-for-2.6.40' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  SUNRPC: Support for RPC over AF_LOCAL transports
  SUNRPC: Remove obsolete comment
  SUNRPC: Use AF_LOCAL for rpcbind upcalls
  SUNRPC: Clean up use of curly braces in switch cases
  NFS: Revert NFSROOT default mount options
  SUNRPC: Rename xs_encode_tcp_fragment_header()
  nfs,rcu: convert call_rcu(nfs_free_delegation_callback) to kfree_rcu()
  nfs41: Correct offset for LAYOUTCOMMIT
  NFS: nfs_update_inode: print current and new inode size in debug output
  NFSv4.1: Fix the handling of NFS4ERR_SEQ_MISORDERED errors
  NFSv4: Handle expired stateids when the lease is still valid
  SUNRPC: Deal with the lack of a SYN_SENT sk->sk_state_change callback...
2011-05-29 11:20:02 -07:00
Linus Torvalds
daa94222b6 Merge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6
* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
  ACPI EC: remove redundant code
  ACPI: Add D3 cold state
  ACPI: processor: fix processor_physically_present in UP kernel
  ACPI: Split out custom_method functionality into an own driver
  ACPI: Cleanup custom_method debug stuff
  ACPI EC: enable MSI workaround for Quanta laptops
  ACPICA: Update to version 20110413
  ACPICA: Execute an orphan _REG method under the EC device
  ACPICA: Move ACPI_NUM_PREDEFINED_REGIONS to a more appropriate place
  ACPICA: Update internal address SpaceID for DataTable regions
  ACPICA: Add more methods eligible for NULL package element removal
  ACPICA: Split all internal Global Lock functions to new file - evglock
  ACPI: EC: add another DMI check for ASUS hardware
  ACPI EC: remove dead code
  ACPICA: Fix code divergence of global lock handling
  ACPICA: Use acpi_os_create_lock interface
  ACPI: osl, add acpi_os_create_lock interface
  ACPI:Fix goto flows in thermal-sys
2011-05-29 11:19:16 -07:00
Linus Torvalds
f310642123 Merge branch 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6
* 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6:
  x86 idle: deprecate mwait_idle() and "idle=mwait" cmdline param
  x86 idle: deprecate "no-hlt" cmdline param
  x86 idle APM: deprecate CONFIG_APM_CPU_IDLE
  x86 idle floppy: deprecate disable_hlt()
  x86 idle: EXPORT_SYMBOL(default_idle, pm_idle) only when APM demands it
  x86 idle: clarify AMD erratum 400 workaround
  idle governor: Avoid lock acquisition to read pm_qos before entering idle
  cpuidle: menu: fixed wrapping timers at 4.294 seconds
2011-05-29 11:18:09 -07:00
Benny Halevy
18ad0a9f2c NFSv4.1: change pg_test return type to bool
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2011-05-29 20:56:54 +03:00
Benny Halevy
cbe8260369 pnfs: layoutreturn
NFSv4.1 LAYOUTRETURN implementation

Currently, does not support layout-type payload encoding.

Signed-off-by: Alexandros Batsakis <batsakis@netapp.com>
Signed-off-by: Andy Adamson <andros@citi.umich.edu>
Signed-off-by: Andy Adamson <andros@netapp.com>
Signed-off-by: Dean Hildebrand <dhildeb@us.ibm.com>
Signed-off-by: Fred Isaman <iisaman@citi.umich.edu>
Signed-off-by: Fred Isaman <iisaman@netapp.com>
Signed-off-by: Marc Eshel <eshel@almaden.ibm.com>
Signed-off-by: Zhang Jingwang <zhangjingwang@nrchpc.ac.cn>
[call pnfs_return_layout right before pnfs_destroy_layout]
[remove assert_spin_locked from pnfs_clear_lseg_list]
[remove wait parameter from the layoutreturn path.]
[remove return_type field from nfs4_layoutreturn_args]
[remove range from nfs4_layoutreturn_args]
[no need to send layoutcommit from _pnfs_return_layout]
[don't wait on sync layoutreturn]
[fix layout stateid in layoutreturn args]
[fixed NULL deref in _pnfs_return_layout]
[removed recaim member of nfs4_layoutreturn_args]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2011-05-29 20:54:36 +03:00
Benny Halevy
d20581aa4b pnfs: support for non-rpc layout drivers
Non-rpc layout driver such as for objects and blocks
implement their own I/O path and error handling logic.
Therefore bypass NFS-based error handling for these layout drivers.

[fix lseg ref-count bugs, and null de-refs]
[Fall out from: non-rpc layout drivers]
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
[get rid of PNFS_USE_RPC_CODE]
[get rid of __nfs4_write_done_cb]
[revert useless change in nfs4_write_done_cb]
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2011-05-29 20:53:51 +03:00
Benny Halevy
38b7c401f6 pnfs-obj: pnfs_osd XDR definitions
* Add the pnfs_osd_xdr.h header

* defintions the pnfs_osd_layout structure including all it's
  sub-types and constants.
* Declare the pnfs_osd_xdr_decode_layout API + all needed
  inline helpers.

* Define the pnfs_osd_deviceaddr structure and all its subtypes and
  constants.
* Declare API for decoding of a pnfs_osd_deviceaddr from XDR stream.

* Define the pnfs_osd_ioerr structure, its substructures and constants.
* Declare API for encoding of a pnfs_osd_ioerr into XDR stream.

* Define the pnfs_osd_layoutupdate structure and its substructures.
* Declare API for encoding of a pnfs_osd_layoutupdate into XDR stream.

[Remove server definitions]
Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2011-05-29 20:52:36 +03:00
Benny Halevy
f7da7a129d SUNRPC: introduce xdr_init_decode_pages
Initialize xdr_stream and xdr_buf using an array of page pointers
and length of buffer.

Signed-off-by: Benny Halevy <bhalevy@panasas.com>
2011-05-29 20:52:32 +03:00
Mikulas Patocka
fa34ce7307 dm kcopyd: return client directly and not through a pointer
Return client directly from dm_kcopyd_client_create, not through a
parameter, making it consistent with dm_io_client_create.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2011-05-29 13:03:13 +01:00
Mikulas Patocka
5f43ba2950 dm kcopyd: reserve fewer pages
Reserve just the minimum of pages needed to process one job.

Because we allocate pages from page allocator, we don't need to reserve
a large number of pages.  The maximum job size is SUB_JOB_SIZE and we
calculate the number of reserved pages based on this.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2011-05-29 13:03:11 +01:00
Mikulas Patocka
bda8efec5c dm io: use fixed initial mempool size
Replace the arbitrary calculation of an initial io struct mempool size
with a constant.

The code calculated the number of reserved structures based on the request
size and used a "magic" multiplication constant of 4.  This patch changes
it to reserve a fixed number - itself still chosen quite arbitrarily.
Further testing might show if there is a better number to choose.

Note that if there is no memory pressure, we can still allocate an
arbitrary number of "struct io" structures.  One structure is enough to
process the whole request.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2011-05-29 13:03:09 +01:00
Mike Snitzer
4c25932701 dm table: allow targets to support discards internally
Permit a target to support discards regardless of whether or not all its
underlying devices do.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
2011-05-29 12:52:55 +01:00
Heiko Carstens
a43a9d93d4 [S390] mm: fix storage key handling
page_get_storage_key() and page_set_storage_key() expect a page address
and not its page frame number. This got inconsistent with 2d42552d
"[S390] merge page_test_dirty and page_clear_dirty".

Result is that we read/write storage keys from random pages and do not
have a working dirty bit tracking at all.
E.g. SetPageUpdate() doesn't clear the dirty bit of requested pages, which
for example ext4 doesn't like very much and panics after a while.

Unable to handle kernel paging request at virtual user address (null)
Oops: 0004 [#1] PREEMPT SMP DEBUG_PAGEALLOC
Modules linked in:
CPU: 1 Not tainted 2.6.39-07551-g139f37f-dirty #152
Process flush-94:0 (pid: 1576, task: 000000003eb34538, ksp: 000000003c287b70)
Krnl PSW : 0704c00180000000 0000000000316b12 (jbd2_journal_file_inode+0x10e/0x138)
           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 EA:3
Krnl GPRS: 0000000000000000 0000000000000000 0000000000000000 0700000000000000
           0000000000316a62 000000003eb34cd0 0000000000000025 000000003c287b88
           0000000000000001 000000003c287a70 000000003f1ec678 000000003f1ec000
           0000000000000000 000000003e66ec00 0000000000316a62 000000003c287988
Krnl Code: 0000000000316b04: f0a0000407f4       srp     4(11,%r0),2036,0
           0000000000316b0a: b9020022           ltgr    %r2,%r2
           0000000000316b0e: a7740015           brc     7,316b38
          >0000000000316b12: e3d0c0000024       stg     %r13,0(%r12)
           0000000000316b18: 4120c010           la      %r2,16(%r12)
           0000000000316b1c: 4130d060           la      %r3,96(%r13)
           0000000000316b20: e340d0600004       lg      %r4,96(%r13)
           0000000000316b26: c0e50002b567       brasl   %r14,36d5f4
Call Trace:
([<0000000000316a62>] jbd2_journal_file_inode+0x5e/0x138)
 [<00000000002da13c>] mpage_da_map_and_submit+0x2e8/0x42c
 [<00000000002daac2>] ext4_da_writepages+0x2da/0x504
 [<00000000002597e8>] writeback_single_inode+0xf8/0x268
 [<0000000000259f06>] writeback_sb_inodes+0xd2/0x18c
 [<000000000025a700>] writeback_inodes_wb+0x80/0x168
 [<000000000025aa92>] wb_writeback+0x2aa/0x324
 [<000000000025abde>] wb_do_writeback+0xd2/0x274
 [<000000000025ae3a>] bdi_writeback_thread+0xba/0x1c4
 [<00000000001737be>] kthread+0xa6/0xb0
 [<000000000056c1da>] kernel_thread_starter+0x6/0xc
 [<000000000056c1d4>] kernel_thread_starter+0x0/0xc
INFO: lockdep is turned off.
Last Breaking-Event-Address:
 [<0000000000316a8a>] jbd2_journal_file_inode+0x86/0x138

Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2011-05-29 12:40:51 +02:00
Len Brown
751516f0a9 Merge branch 'ec-cleanup' into release
Conflicts:
	drivers/platform/x86/compal-laptop.c
2011-05-29 04:40:39 -04:00
Tim Chen
333c5ae994 idle governor: Avoid lock acquisition to read pm_qos before entering idle
Thanks to the reviews and comments by Rafael, James, Mark and Andi.
Here's version 2 of the patch incorporating your comments and also some
update to my previous patch comments.

I noticed that before entering idle state, the menu idle governor will
look up the current pm_qos target value according to the list of qos
requests received.  This look up currently needs the acquisition of a
lock to access the list of qos requests to find the qos target value,
slowing down the entrance into idle state due to contention by multiple
cpus to access this list.  The contention is severe when there are a lot
of cpus waking and going into idle.  For example, for a simple workload
that has 32 pair of processes ping ponging messages to each other, where
64 cpu cores are active in test system, I see the following profile with
37.82% of cpu cycles spent in contention of pm_qos_lock:

-     37.82%          swapper  [kernel.kallsyms]          [k]
_raw_spin_lock_irqsave
   - _raw_spin_lock_irqsave
      - 95.65% pm_qos_request
           menu_select
           cpuidle_idle_call
         - cpu_idle
              99.98% start_secondary

A better approach will be to cache the updated pm_qos target value so
reading it does not require lock acquisition as in the patch below.
With this patch the contention for pm_qos_lock is removed and I saw a
2.2X increase in throughput for my message passing workload.

cc: stable@kernel.org
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: James Bottomley <James.Bottomley@suse.de>
Acked-by: mark gross <markgross@thegnar.org>
Signed-off-by: Len Brown <len.brown@intel.com>
2011-05-29 00:50:59 -04:00
Linus Torvalds
36947a7682 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (36 commits)
  Cache xattr security drop check for write v2
  fs: block_page_mkwrite should wait for writeback to finish
  mm: Wait for writeback when grabbing pages to begin a write
  configfs: remove unnecessary dentry_unhash on rmdir, dir rename
  fat: remove unnecessary dentry_unhash on rmdir, dir rename
  hpfs: remove unnecessary dentry_unhash on rmdir, dir rename
  minix: remove unnecessary dentry_unhash on rmdir, dir rename
  fuse: remove unnecessary dentry_unhash on rmdir, dir rename
  coda: remove unnecessary dentry_unhash on rmdir, dir rename
  afs: remove unnecessary dentry_unhash on rmdir, dir rename
  affs: remove unnecessary dentry_unhash on rmdir, dir rename
  9p: remove unnecessary dentry_unhash on rmdir, dir rename
  ncpfs: fix rename over directory with dangling references
  ncpfs: document dentry_unhash usage
  ecryptfs: remove unnecessary dentry_unhash on rmdir, dir rename
  hostfs: remove unnecessary dentry_unhash on rmdir, dir rename
  hfsplus: remove unnecessary dentry_unhash on rmdir, dir rename
  hfs: remove unnecessary dentry_unhash on rmdir, dir rename
  omfs: remove unnecessary dentry_unhash on rmdir, dir rneame
  udf: remove unnecessary dentry_unhash from rmdir, dir rename
  ...
2011-05-28 13:03:41 -07:00
Linus Torvalds
a947e23a8e Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, asm: Clean up desc.h a bit
  x86, amd: Do not enable ARAT feature on AMD processors below family 0x12
  x86: Move do_page_fault()'s error path under unlikely()
  x86, efi: Retain boot service code until after switching to virtual mode
  x86: Remove unnecessary check in detect_ht()
  x86: Reorder mm_context_t to remove x86_64 alignment padding and thus shrink mm_struct
  x86, UV: Clean up uv_tlb.c
  x86, UV: Add support for SGI UV2 hub chip
  x86, cpufeature: Update CPU feature RDRND to RDRAND
2011-05-28 12:57:01 -07:00
Linus Torvalds
08a8b79600 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  cpuset: Fix cpuset_cpus_allowed_fallback(), don't update tsk->rt.nr_cpus_allowed
  sched: Fix ->min_vruntime calculation in dequeue_entity()
  sched: Fix ttwu() for __ARCH_WANT_INTERRUPTS_ON_CTXSW
  sched: More sched_domain iterations fixes
2011-05-28 12:56:46 -07:00
Linus Torvalds
1ba4b8cb94 Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  rcu: Start RCU kthreads in TASK_INTERRUPTIBLE state
  rcu: Remove waitqueue usage for cpu, node, and boost kthreads
  rcu: Avoid acquiring rcu_node locks in timer functions
  atomic: Add atomic_or()
  Documentation: Add statistics about nested locks
  rcu: Decrease memory-barrier usage based on semi-formal proof
  rcu: Make rcu_enter_nohz() pay attention to nesting
  rcu: Don't do reschedule unless in irq
  rcu: Remove old memory barriers from rcu_process_callbacks()
  rcu: Add memory barriers
  rcu: Fix unpaired rcu_irq_enter() from locking selftests
2011-05-28 12:56:32 -07:00
Linus Torvalds
c4a227d89f Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (25 commits)
  perf: Fix SIGIO handling
  perf top: Don't stop if no kernel symtab is found
  perf top: Handle kptr_restrict
  perf top: Remove unused macro
  perf events: initialize fd array to -1 instead of 0
  perf tools: Make sure kptr_restrict warnings fit 80 col terms
  perf tools: Fix build on older systems
  perf symbols: Handle /proc/sys/kernel/kptr_restrict
  perf: Remove duplicate headers
  ftrace: Add internal recursive checks
  tracing: Update btrfs's tracepoints to use u64 interface
  tracing: Add __print_symbolic_u64 to avoid warnings on 32bit machine
  ftrace: Set ops->flag to enabled even on static function tracing
  tracing: Have event with function tracer check error return
  ftrace: Have ftrace_startup() return failure code
  jump_label: Check entries limit in __jump_label_update
  ftrace/recordmcount: Avoid STT_FUNC symbols as base on ARM
  scripts/tags.sh: Add magic for trace-events for etags too
  scripts/tags.sh: Fix ctags for DEFINE_EVENT()
  x86/ftrace: Fix compiler warning in ftrace.c
  ...
2011-05-28 12:55:55 -07:00
Linus Torvalds
87367a0b71 Merge branch 'for-usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/sarah/xhci
* 'for-usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/sarah/xhci:
  Intel xhci: Limit number of active endpoints to 64.
  Intel xhci: Ignore spurious successful event.
  Intel xhci: Support EHCI/xHCI port switching.
  Intel xhci: Add PCI id for Panther Point xHCI host.
  xhci: STFU: Be quieter during URB submission and completion.
  xhci: STFU: Don't print event ring dequeue pointer.
  xhci: STFU: Remove function tracing.
  xhci: Don't submit commands when the host is dead.
  xhci: Clear stopped_td when Stop Endpoint command completes.
2011-05-28 12:36:15 -07:00
Linus Torvalds
4cb865deec Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx: (33 commits)
  x86: poll waiting for I/OAT DMA channel status
  maintainers: add dma engine tree details
  dmaengine: add TODO items for future work on dma drivers
  dmaengine: Add API documentation for slave dma usage
  dmaengine/dw_dmac: Update maintainer-ship
  dmaengine: move link order
  dmaengine/dw_dmac: implement pause and resume in dwc_control
  dmaengine/dw_dmac: Replace spin_lock* with irqsave variants and enable submission from callback
  dmaengine/dw_dmac: Divide one sg to many desc, if sg len is greater than DWC_MAX_COUNT
  dmaengine/dw_dmac: set residue as total len in dwc_tx_status if status is !DMA_SUCCESS
  dmaengine/dw_dmac: don't call callback routine in case dmaengine_terminate_all() is called
  dmaengine: at_hdmac: pause: no need to wait for FIFO empty
  pch_dma: modify pci device table definition
  pch_dma: Support new device ML7223 IOH
  pch_dma: Support I2S for ML7213 IOH
  pch_dma: Fix DMA setting issue
  pch_dma: modify for checkpatch
  pch_dma: fix dma direction issue for ML7213 IOH video-in
  dmaengine: at_hdmac: use descriptor chaining help function
  dmaengine: at_hdmac: implement pause and resume in atc_control
  ...

Fix up trivial conflict in drivers/dma/dw_dmac.c
2011-05-28 12:35:15 -07:00
Linus Torvalds
04830fccdc Merge branch 'gpio/next' of git://git.secretlab.ca/git/linux-2.6
* 'gpio/next' of git://git.secretlab.ca/git/linux-2.6:
  gpio/pch_gpio: Support new device ML7223
  gpio: make gpio_{request,free}_array gpio array parameter const
  GPIO: OMAP: move to drivers/gpio
  GPIO: OMAP: move register offset defines into <plat/gpio.h>
  gpio: Convert gpio_is_valid to return bool
  gpio: Move the s5pc100 GPIO to drivers/gpio
  gpio: Move the s5pv210 GPIO to drivers/gpio
  gpio: Move the exynos4 GPIO to drivers/gpio
  gpio: Move to Samsung common GPIO library to drivers/gpio
  gpio/nomadik: add function to read GPIO pull down status
  gpio/nomadik: show all pins in debug
  gpio: move Nomadik GPIO driver to drivers/gpio
  gpio: move U300 GPIO driver to drivers/gpio
  langwell_gpio: add runtime pm support
  gpio/pca953x: Add support for pca9574 and pca9575 devices
  gpio/cs5535: Show explicit dependency between gpio_cs5535 and mfd_cs5535
2011-05-28 10:56:34 -07:00
Andi Kleen
69b4573296 Cache xattr security drop check for write v2
Some recent benchmarking on btrfs showed that a major scaling bottleneck
on large systems on btrfs is currently the xattr lookup on every write.

Why xattr lookup on every write I hear you ask?

write wants to drop suid and security related xattrs that could set o
capabilities for executables.  To do that it currently looks up
security.capability on EVERY write (even for non executables) to decide
whether to drop it or not.

In btrfs this causes an additional tree walk, hitting some per file system
locks and quite bad scalability. In a simple read workload on a 8S
system I saw over 90% CPU time in spinlocks related to that.

Chris Mason tells me this is also a problem in ext4, where it hits
the global mbcache lock.

This patch adds a simple per inode to avoid this problem.  We only
do the lookup once per file and then if there is no xattr cache
the decision. All xattr changes clear the flag.

I also used the same flag to avoid the suid check, although
that one is pretty cheap.

A file system can also set this flag when it creates the inode,
if it has a cheap way to do so.  This is done for some common file systems
in followon patches.

With this patch a major part of the lock contention disappears
for btrfs. Some testing on smaller systems didn't show significant
performance changes, but at least it helps the larger systems
and is generally more efficient.

v2: Rename is_sgid. add file system helper.
Cc: chris.mason@oracle.com
Cc: josef@redhat.com
Cc: viro@zeniv.linux.org.uk
Cc: agruen@linbit.com
Cc: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2011-05-28 12:02:09 -04:00
Paul E. McKenney
55c2945aa9 atomic: Add atomic_or()
An atomic_or() function is needed by TREE_RCU to avoid deadlock, so
add a generic version.

Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-28 17:41:46 +02:00
KOSAKI Motohiro
1e1b6c511d cpuset: Fix cpuset_cpus_allowed_fallback(), don't update tsk->rt.nr_cpus_allowed
The rule is, we have to update tsk->rt.nr_cpus_allowed if we change
tsk->cpus_allowed. Otherwise RT scheduler may confuse.

Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/4DD4B3FA.5060901@jp.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-28 17:02:57 +02:00
Linus Torvalds
29a6ccca38 Merge git://git.infradead.org/mtd-2.6
* git://git.infradead.org/mtd-2.6: (97 commits)
  mtd: kill CONFIG_MTD_PARTITIONS
  mtd: remove add_mtd_partitions, add_mtd_device and friends
  mtd: convert remaining users to mtd_device_register()
  mtd: samsung onenand: convert to mtd_device_register()
  mtd: omap2 onenand: convert to mtd_device_register()
  mtd: txx9ndfmc: convert to mtd_device_register()
  mtd: tmio_nand: convert to mtd_device_register()
  mtd: socrates_nand: convert to mtd_device_register()
  mtd: sharpsl: convert to mtd_device_register()
  mtd: s3c2410 nand: convert to mtd_device_register()
  mtd: ppchameleonevb: convert to mtd_device_register()
  mtd: orion_nand: convert to mtd_device_register()
  mtd: omap2: convert to mtd_device_register()
  mtd: nomadik_nand: convert to mtd_device_register()
  mtd: ndfc: convert to mtd_device_register()
  mtd: mxc_nand: convert to mtd_device_register()
  mtd: mpc5121_nfc: convert to mtd_device_register()
  mtd: jz4740_nand: convert to mtd_device_register()
  mtd: h1910: convert to mtd_device_register()
  mtd: fsmc_nand: convert to mtd_device_register()
  ...

Fixed up trivial conflicts in
 - drivers/mtd/maps/integrator-flash.c: removed in ARM tree
 - drivers/mtd/maps/physmap.c: addition of afs partition probe type
   clashing with removal of CONFIG_MTD_PARTITIONS
2011-05-27 20:06:53 -07:00
Linus Torvalds
2a56d22202 Merge branch 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm
* 'for-linus' of master.kernel.org:/home/rmk/linux-2.6-arm: (45 commits)
  ARM: 6945/1: Add unwinding support for division functions
  ARM: kill pmd_off()
  ARM: 6944/1: mm: allow ASID 0 to be allocated to tasks
  ARM: 6943/1: mm: use TTBR1 instead of reserved context ID
  ARM: 6942/1: mm: make TTBR1 always point to swapper_pg_dir on ARMv6/7
  ARM: 6941/1: cache: ensure MVA is cacheline aligned in flush_kern_dcache_area
  ARM: add sendmmsg syscall
  ARM: 6863/1: allow hotplug on msm
  ARM: 6832/1: mmci: support for ST-Ericsson db8500v2
  ARM: 6830/1: mach-ux500: force PrimeCell revisions
  ARM: 6829/1: amba: make hardcoded periphid override hardware
  ARM: 6828/1: mach-ux500: delete SSP PrimeCell ID
  ARM: 6827/1: mach-netx: delete hardcoded periphid
  ARM: 6940/1: fiq: Briefly document driver responsibilities for suspend/resume
  ARM: 6938/1: fiq: Refactor {get,set}_fiq_regs() for Thumb-2
  ARM: 6914/1: sparsemem: fix highmem detection when using SPARSEMEM
  ARM: 6913/1: sparsemem: allow pfn_valid to be overridden when using SPARSEMEM
  at91: drop at572d940hf support
  at91rm9200: introduce at91rm9200_set_type to specficy cpu package
  at91: drop boot_params and PLAT_PHYS_OFFSET
  ...
2011-05-27 19:51:32 -07:00
Lars-Peter Clausen
7c295975a8 gpio: make gpio_{request,free}_array gpio array parameter const
gpio_{request,free}_array should not (and do not) modify the passed gpio
array, so make the parameter const.

Signed-off-by: Lars-Peter Clausen <lars@metafoo.de>
Acked-by: Eric Miao <eric.y.miao@gmail.com>
Acked-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-05-27 17:56:45 -06:00
Chuck Lever
176e21ee2e SUNRPC: Support for RPC over AF_LOCAL transports
TI-RPC introduces the capability of performing RPC over AF_LOCAL
sockets.  It uses this mainly for registering and unregistering
local RPC services securely with the local rpcbind, but we could
also conceivably use it as a generic upcall mechanism.

This patch provides a client-side only implementation for the moment.
We might also consider a server-side implementation to provide
AF_LOCAL access to NLM (for statd downcalls, and such like).

Autobinding is not supported on kernel AF_LOCAL transports at this
time.  Kernel ULPs must specify the pathname of the remote endpoint
when an AF_LOCAL transport is created.  rpcbind supports registering
services available via AF_LOCAL, so the kernel could handle it with
some adjustment to ->rpcbind and ->set_port.  But we don't need this
feature for doing upcalls via well-known named sockets.

This has not been tested with ULPs that move a substantial amount of
data.  Thus, I can't attest to how robust the write_space and
congestion management logic is.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-05-27 17:42:47 -04:00
Linus Torvalds
10799db60c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
  net: Kill ratelimit.h dependency in linux/net.h
  net: Add linux/sysctl.h includes where needed.
  net: Kill ether_table[] declaration.
  inetpeer: fix race in unused_list manipulations
  atm: expose ATM device index in sysfs
  IPVS: bug in ip_vs_ftp, same list heaad used in all netns.
  bug.h: Move ratelimit warn interfaces to ratelimit.h
  bonding: cleanup module option descriptions
  net:8021q:vlan.c Fix pr_info to just give the vlan fullname and version.
  net: davinci_emac: fix dev_err use at probe
  can: convert to %pK for kptr_restrict support
  net: fix ETHTOOL_SFEATURES compatibility with old ethtool_ops.set_flags
  netfilter: Fix several warnings in compat_mtw_from_user().
  netfilter: ipset: fix ip_set_flush return code
  netfilter: ipset: remove unused variable from type_pf_tdel()
  netfilter: ipset: Use proper timeout value to jiffies conversion
2011-05-27 11:16:27 -07:00
David S. Miller
c5c177b4ac net: Kill ratelimit.h dependency in linux/net.h
Ingo Molnar noticed that we have this unnecessary ratelimit.h
dependency in linux/net.h, which hid compilation problems from
people doing builds only with CONFIG_NET enabled.

Move this stuff out to a seperate net/net_ratelimit.h file and
include that in the only two places where this thing is needed.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Ingo Molnar <mingo@elte.hu>
2011-05-27 13:41:33 -04:00