Commit Graph

25662 Commits

Author SHA1 Message Date
Rafał Miłecki
6f53912fc3 bcma: allow enabling PLL
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-07-19 17:03:11 -04:00
Rafał Miłecki
7424dd0d03 bcma: allow setting FAST clockmode for a core
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-07-19 17:03:11 -04:00
Rafał Miłecki
3de1a7748f bcma: trivial: add helpers for masking/setting
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-07-19 17:03:09 -04:00
Rafał Miłecki
bb932ad980 bcma: move define of BCMA_CLKCTLST register
Recent experiments have shown many cores share 0x1E0 register used for
clock management.

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-07-19 17:03:08 -04:00
Johannes Berg
34850ab25d cfg80211: allow userspace to control supported rates in scan
Some P2P scans are not allowed to advertise
11b rates, but that is a rather special case
so instead of having that, allow userspace
to request the rate sets (per band) that are
advertised in scan probe request frames.

Since it's needed in two places now, factor
out some common code parsing a rate array.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-07-19 16:49:58 -04:00
Rafał Miłecki
2b5e3322b8 bcma: define IO status register
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-07-19 16:49:54 -04:00
Kalle Valo
856799d582 ieee80211: add few wmm tspec values
These are needed by ath6kl for parsing tspec status from an IE.

Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-07-19 16:49:54 -04:00
Rafał Miłecki
eb1577b7c4 bcma: handle alternative SPROM location
Some cards do not use additional 0x30 offset for SPROM location. We do
not know the real condition for it yet, make it BCM4331 specific for
now.

Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-07-19 16:49:53 -04:00
Florian Westphal
97d32cf944 netfilter: nfnetlink_queue: batch verdict support
Introduces a new nfnetlink type that applies a given
verdict to all queued packets with an id <= the id in the verdict
message.

If a mark is provided it is applied to all matched packets.

This reduces the number of verdicts that have to be sent.
Applications that make use of this feature need to maintain
a timeout to send a batchverdict periodically to avoid starvation.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-07-19 11:46:33 +02:00
Or Gerlitz
cfcde11c3d IB/mlx4: Use flow counters on IBoE ports
Allocate flow counter per Ethernet/IBoE port, and attach this counter
to all the QPs created on that port.  Based on patch by Eli Cohen
<eli@mellanox.co.il>.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-07-18 21:04:36 -07:00
Or Gerlitz
f2a3f6a32c mlx4_core: Add network flow counters
ConnectX devices support a set of flow counters that can be attached
to a set containing one or more QPs.  Each such counter tracks receive
and transmit packets and bytes of these QPs.  This patch queries the
device to check support for counters, handles initialization of the
HCA to enable counters, and initializes a bitmap allocator to control
counter allocations.  Derived from patch by Eli Cohen <eli@mellanox.co.il>.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-07-18 21:04:34 -07:00
Or Gerlitz
98a13e487a mlx4_core: Fix location of counter index in QP context struct
Fix the address handle portion of the QP context structure to have the
correct bit location for the counter index field.

Signed-off-by: Eli Cohen <eli@mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-07-18 21:04:33 -07:00
Or Gerlitz
ccf8632196 mlx4_core: Read extended capabilities into the flags field
Query another dword containing up to 32 extended device capabilities
and merge it into struct mlx4_caps.flags.  Update the code that
handles the current extended device capabilities (e.g UDP RSS, WoL,
vep steering, etc) to use the extended device cap flags field instead
of a field per extended capability.  Initial patch done by Eli Cohen
<eli@mellanox.co.il>.

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-07-18 21:04:32 -07:00
Or Gerlitz
52eafc68d6 mlx4_core: Extend capability flags to 64 bits
The latest firmware adds a second dword containing more device flags,
so extend the device capabilities flags field from 32 to 64 bits.
Derived from patch by Eli Cohen <eli@mellanox.co.il>

Signed-off-by: Or Gerlitz <ogerlitz@mellanox.co.il>
Signed-off-by: Roland Dreier <roland@purestorage.com>
2011-07-18 21:04:32 -07:00
Grant Likely
99ce39e359 dt: include linux/errno.h in linux/of_address.h
of_address.h makes reference to some of the error code #defines, so it
needs to include errno.h.  If CONFIG_PCI is not selected, then some files
will fail to compile.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-07-18 16:37:45 -06:00
Grant Likely
90e33f62e0 of/address: Add of_find_matching_node_by_address helper
of_find_matching_node_by_address() can be used to find a device tree
node for a device at a specific address.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-07-18 16:32:26 -06:00
Vladimir Zapolskiy
f701e5b73a connector: add an event for monitoring process tracers
This change adds a procfs connector event, which is emitted on every
successful process tracer attach or detach.

If some process connects to other one, kernelspace connector reports
process id and thread group id of both these involved processes. On
disconnection null process id is returned.

Such an event allows to create a simple automated userspace mechanism
to be aware about processes connecting to others, therefore predefined
process policies can be applied to them if needed.

Note, a detach signal is emitted only in case, if a tracer process
explicitly executes PTRACE_DETACH request. In other cases like tracee
or tracer exit detach event from proc connector is not reported.

Signed-off-by: Vladimir Zapolskiy <vzapolskiy@gmail.com>
Acked-by: Evgeniy Polyakov <zbr@ioremap.net>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-07-18 21:38:33 +02:00
Rafał Miłecki
09779aded8 ssb: SPROM: add LED duty cycle fields
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-07-18 14:29:04 -04:00
WANG Cong
a07c7964a2 include/linux/sdla.h: remove the prototype of sdla()
`make headers_check` complains that

linux-2.6/usr/include/linux/sdla.h:116: userspace cannot reference
function or variable defined in the kernel

this is due to that there is no such a kernel function,

void sdla(void *cfg_info, char *dev, struct frad_conf *conf, int quiet);

I don't know why we have it in a kernel header, so remove it.

Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-18 11:06:03 -07:00
Srinivas Kandagatla
61b8013a11 stmmac: Allow SOCs to use Store forward mode eventhough tx_coe is 0. (V2)
This patch adds new field 'force_sf_dma_mode' to plat_stmmacenet_data
struct to allow users to specify if they want to use force store forward
eventhough tx_coe is not available in hw.
without this flag stmmac driver will use cut-thru mode not use
store-forward mode.

Signed-off-by: Srinivas Kandagatla <srinivas.kandagatla@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-18 10:47:24 -07:00
Eric Dumazet
6b75e3e8d6 netfilter: nfnetlink: add RCU in nfnetlink_rcv_msg()
Goal of this patch is to permit nfnetlink providers not mandate
nfnl_mutex being held while nfnetlink_rcv_msg() calls them.

If struct nfnl_callback contains a non NULL call_rcu(), then
nfnetlink_rcv_msg() will use it instead of call() field, holding
rcu_read_lock instead of nfnl_mutex

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Florian Westphal <fw@strlen.de>
CC: Eric Leblond <eric@regit.org>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-07-18 16:08:07 +02:00
J. Bruce Fields
1091006c5e nfsd: turn on reply cache for NFSv4
It's sort of ridiculous that we've never had a working reply cache for
NFSv4.

On the other hand, we may still not: our current reply cache is likely
not very good, especially in the TCP case (which is the only case that
matters for v4).  What we really need here is some serious testing.

Anyway, here's a start.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-07-18 09:39:01 -04:00
Arnd Bergmann
bc574e190d Merge branches 'omap/prcm' and 'omap/mfd' of git+ssh://master.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc into next/devel-2 2011-07-17 21:48:22 +02:00
David Lamparter
69ecca86da net: vlan, qlcnic: make vlan_find_dev private
there is only one user of vlan_find_dev outside of the actual vlan code:
qlcnic uses it to iterate over some VLANs it knows.

let's just make vlan_find_dev private to the VLAN code and have the
iteration in qlcnic be a bit more direct. (a few rcu dereferences less
too)

Signed-off-by: David Lamparter <equinox@diac24.net>
Cc: Patrick McHardy <kaber@trash.net>
Cc: Amit Kumar Salecha <amit.salecha@qlogic.com>
Cc: Anirban Chakraborty <anirban.chakraborty@qlogic.com>
Cc: linux-driver@qlogic.com
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-17 12:33:22 -07:00
David Lamparter
178edcbc6b net: add 802.1ad / 802.1ah / QinQ ethertypes
define ETH_P_8021AD to 88a8 (assigned by IEEE) and add ETH_P_QINQ{1,2,3}
for the pre-standard 9{1,2,3}00 types. all of them use 802.1q frame
format, with 1 bit used differently in some cases.

also define ETH_P_8021AH to 88e7 (assigned by IEEE). this is Mac-in-Mac
and uses a different, 16-byte header.

Signed-off-by: David Lamparter <equinox@diac24.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-17 12:33:22 -07:00
Oleg Nesterov
d184d6eb1d ptrace: dont send SIGSTOP on auto-attach if PT_SEIZED
The fake SIGSTOP during attach has numerous problems. PTRACE_SEIZE
is already fine, but we have basically the same problems is SIGSTOP
is sent on auto-attach, the tracer can't know if this signal signal
should be cancelled or not.

Change ptrace_event() to set JOBCTL_TRAP_STOP if the new child is
PT_SEIZED, this triggers the PTRACE_EVENT_STOP report.

Thereafter a PT_SEIZED task can never report the bogus SIGSTOP.

Test-case:

	#define PTRACE_SEIZE		0x4206
	#define PTRACE_SEIZE_DEVEL	0x80000000
	#define PTRACE_EVENT_STOP	7
	#define WEVENT(s)		((s & 0xFF0000) >> 16)

	int main(void)
	{
		int child, grand_child, status;
		long message;

		child = fork();
		if (!child) {
			kill(getpid(), SIGSTOP);
			fork();
			assert(0);
			return 0x23;
		}

		assert(ptrace(PTRACE_SEIZE, child, 0,PTRACE_SEIZE_DEVEL) == 0);
		assert(wait(&status) == child);
		assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGSTOP);

		assert(ptrace(PTRACE_SETOPTIONS, child, 0, PTRACE_O_TRACEFORK) == 0);

		assert(ptrace(PTRACE_CONT, child, 0,0) == 0);
		assert(waitpid(child, &status, 0) == child);
		assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP);
		assert(WEVENT(status) == PTRACE_EVENT_FORK);

		assert(ptrace(PTRACE_GETEVENTMSG, child, 0, &message) == 0);
		grand_child = message;

		assert(waitpid(grand_child, &status, 0) == grand_child);
		assert(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP);
		assert(WEVENT(status) == PTRACE_EVENT_STOP);

		kill(child, SIGKILL);
		kill(grand_child, SIGKILL);
		return 0;
	}

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
2011-07-17 20:23:52 +02:00
Oleg Nesterov
dcace06cc2 ptrace: mv send-SIGSTOP from do_fork() to ptrace_init_task()
If the new child is traced, do_fork() adds the pending SIGSTOP.
It assumes that either it is traced because of auto-attach or the
tracer attached later, in both cases sigaddset/set_thread_flag is
correct even if SIGSTOP is already pending.

Now that we have PTRACE_SEIZE this is no longer right in the latter
case. If the tracer does PTRACE_SEIZE after copy_process() makes the
child visible the queued SIGSTOP is wrong.

We could check PT_SEIZED bit and change ptrace_attach() to set both
PT_PTRACED and PT_SEIZED bits simultaneously but see the next patch,
we need to know whether this child was auto-attached or not anyway.

So this patch simply moves this code to ptrace_init_task(), this
way we can never race with ptrace_attach().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
2011-07-17 20:23:51 +02:00
Oleg Nesterov
6634ae1033 ptrace_init_task: initialize child->jobctl explicitly
new_child->jobctl is not initialized during the fork, it is copied
from parent->jobctl. Currently this is harmless, the forking task
is running and copy_process() can't succeed if signal_pending() is
true, so only JOBCTL_STOP_DEQUEUED can be copied. Still this is a
bit fragile, it would be more clean to set ->jobctl = 0 explicitly.

Also, check ->ptrace != 0 instead of PT_PTRACED, move the
CONFIG_HAVE_HW_BREAKPOINT code up.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
2011-07-17 20:23:51 +02:00
David S. Miller
b23b5455b6 neigh: Kill hh_cache->hh_output
It's just taking on one of two possible values, either
neigh_ops->output or dev_queue_xmit().  And this is purely depending
upon whether nud_state has NUD_CONNECTED set or not.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-16 17:45:02 -07:00
Stefan Richter
f6a7cd0212 firewire: cdev: ABI documentation enhancements
Add overview documentation in Documentation/ABI/stable/firewire-cdev.

Improve the inline reference documentation in firewire-cdev.h:

  - Add /* available since kernel... */ comments to event numbers
    consistent with the comments on ioctl numbers.

  - Shorten some documentation on an event and an ioctl that are
    less interesting to current programming because there are newer
    preferable variants.

  - Spell Configuration ROM (name of an IEEE 1212 register) in
    upper case.

  - Move the dummy FW_CDEV_VERSION out of the reader's field of
    vision.  We should remove it from the header next year or so.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
2011-07-16 07:24:32 +02:00
Stefan Richter
93b37905f7 firewire: cdev: prevent race between first get_info ioctl and bus reset event queuing
Between open(2) of a /dev/fw* and the first FW_CDEV_IOC_GET_INFO
ioctl(2) on it, the kernel already queues FW_CDEV_EVENT_BUS_RESET events
to be read(2) by the client.  The get_info ioctl is practically always
issued right away after open, hence this condition only occurs if the
client opens during a bus reset, especially during a rapid series of bus
resets.

The problem with this condition is twofold:

  - These bus reset events carry the (as yet undocumented) @closure
    value of 0.  But it is not the kernel's place to choose closures;
    they are privat to the client.  E.g., this 0 value forced from the
    kernel makes it unsafe for clients to dereference it as a pointer to
    a closure object without NULL pointer check.

  - It is impossible for clients to determine the relative order of bus
    reset events from get_info ioctl(2) versus those from read(2),
    except in one way:  By comparison of closure values.  Again, such a
    procedure imposes complexity on clients and reduces freedom in use
    of the bus reset closure.

So, change the ABI to suppress queuing of bus reset events before the
first FW_CDEV_IOC_GET_INFO ioctl was issued by the client.

Note, this ABI change cannot be version-controlled.  The kernel cannot
distinguish old from new clients before the first FW_CDEV_IOC_GET_INFO
ioctl.

We will try to back-merge this change into currently maintained stable/
longterm series, and we only document the new behaviour.  The old
behavior is now considered a kernel bug, which it basically is.

Signed-off-by: Stefan Richter <stefanr@s5r6.in-berlin.de>
Cc: <stable@kernel.org>
2011-07-16 07:24:32 +02:00
Grant Likely
8c11642a50 Merge commit 'v3.0-rc7' into devicetree/next 2011-07-15 20:11:34 -06:00
NeilBrown
49b28684fd nfsd: Remove deprecated nfsctl system call and related code.
As promised in feature-removal-schedule.txt it is time to
remove the nfsctl system call.

Userspace has perferred to not use this call throughout 2.6 and it has been
excluded in the default configuration since 2.6.36 (9 months ago).

So this patch removes all the code that was being compiled out.

There are still references to sys_nfsctl in various arch systemcall tables
and related code.  These should be cleaned out too, probably in the next
merge window.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
2011-07-15 18:58:42 -04:00
Rafael J. Wysocki
7ae033cc0d Merge branch 'pm-runtime' into for-linus
* pm-runtime:
  OMAP: PM: disable idle on suspend for GPIO and UART
  OMAP: PM: omap_device: add API to disable idle on suspend
  OMAP: PM: omap_device: add system PM methods for PM domain handling
  OMAP: PM: omap_device: conditionally use PM domain runtime helpers
  PM / Runtime: Add new helper function: pm_runtime_status_suspended()
  PM / Runtime: Consistent utilization of deferred_resume
  PM / Runtime: Prevent runtime_resume from racing with probe
  PM / Runtime: Replace "run-time" with "runtime" in documentation
  PM / Runtime: Improve documentation of enable, disable and barrier
  PM: Limit race conditions between runtime PM and system sleep (v2)
  PCI / PM: Detect early wakeup in pci_pm_prepare()
  PM / Runtime: Return special error code if runtime PM is disabled
  PM / Runtime: Update documentation of interactions with system sleep
2011-07-15 23:59:25 +02:00
Rafael J. Wysocki
ba1389d74f Merge branch 'pm-domains' into for-linus
* pm-domains: (33 commits)
  ARM / shmobile: Return -EBUSY from A4LC power off if A3RV is active
  PM / Domains: Take .power_off() error code into account
  ARM / shmobile: Use genpd_queue_power_off_work()
  ARM / shmobile: Use pm_genpd_poweroff_unused()
  PM / Domains: Introduce function to power off all unused PM domains
  PM / Domains: Queue up power off work only if it is not pending
  PM / Domains: Improve handling of wakeup devices during system suspend
  PM / Domains: Do not restore all devices on power off error
  PM / Domains: Allow callbacks to execute all runtime PM helpers
  PM / Domains: Do not execute device callbacks under locks
  PM / Domains: Make failing pm_genpd_prepare() clean up properly
  PM / Domains: Set device state to "active" during system resume
  ARM: mach-shmobile: sh7372 A3RV requires A4LC
  PM / Domains: Export pm_genpd_poweron() in header
  ARM: mach-shmobile: sh7372 late pm domain off
  ARM: mach-shmobile: Runtime PM late init callback
  ARM: mach-shmobile: sh7372 D4 support
  ARM: mach-shmobile: sh7372 A4MP support
  ARM: mach-shmobile: sh7372: make sure that fsi is peripheral of spu2
  ARM: mach-shmobile: sh7372 A3SG support
  ...
2011-07-15 23:59:09 +02:00
MyungJoo Ham
3b5fe85252 PM / Suspend: Add .suspend_again() callback to suspend_ops
A system or a device may need to control suspend/wakeup events. It may
want to wakeup the system after a predefined amount of time or at a
predefined event decided while entering suspend for polling or delayed
work. Then, it may want to enter suspend again if its predefined wakeup
condition is the only wakeup reason and there is no outstanding events;
thus, it does not wakeup the userspace unnecessary or unnecessary
devices and keeps suspended as long as possible (saving the power).

Enabling a system to wakeup after a specified time can be easily
achieved by using RTC. However, to enter suspend again immediately
without invoking userland and unrelated devices, we need additional
features in the suspend framework.

Such need comes from:

 1. Monitoring a critical device status without interrupts that can
wakeup the system. (in-suspend polling)
 An example is ambient temperature monitoring that needs to shut down
the system or a specific device function if it is too hot or cold. The
temperature of a specific device may be needed to be monitored as well;
e.g., a charger monitors battery temperature in order to stop charging
if overheated.

 2. Execute critical "delayed work" at suspend.
 A driver or a system/board may have a delayed work (or any similar
things) that it wants to execute at the requested time.
 For example, some chargers want to check the battery voltage some
time (e.g., 30 seconds) after the battery is fully charged and the
charger has stopped. Then, the charger restarts charging if the voltage
has dropped more than a threshold, which is smaller than "restart-charger"
voltage, which is a threshold to restart charging regardless of the
time passed.

This patch allows to add "suspend_again" callback at struct
platform_suspend_ops and let the "suspend_again" callback return true if
the system is required to enter suspend again after the current instance
of wakeup. Device-wise suspend_again implemented at dev_pm_ops or
syscore is not done because: a) suspend_again feature is usually under
platform-wise decision and controls the behavior of the whole platform
and b) There are very limited devices related to the usage cases of
suspend_again; chargers and temperature sensors are mentioned so far.

With suspend_again callback registered at struct platform_suspend_ops
suspend_ops in kernel/power/suspend.c with suspend_set_ops by the
platform, the suspend framework tries to enter suspend again by
looping suspend_enter() if suspend_again has returned true and there has
been no errors in the suspending sequence or pending wakeups (by
pm_wakeup_pending).

Tested at Exynos4-NURI.

[rjw: Fixed up kerneldoc comment for suspend_enter().]

Signed-off-by: MyungJoo Ham <myungjoo.ham@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-07-15 23:58:19 +02:00
Nishanth Menon
99f381d354 PM / OPP: Introduce function to free cpufreq table
cpufreq table allocated by opp_init_cpufreq_table is better
freed by OPP layer itself. This allows future modifications to
the table handling to be transparent to the users.

Signed-off-by: Nishanth Menon <nm@ti.com>
Acked-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-07-15 23:58:18 +02:00
Peter Korsgaard
1bb6f9b042 mcp23s08: get rid of setup/teardown callbacks
There's no in-tree users, and bus notifiers are more generic anyway.

Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-07-15 13:54:17 -06:00
Luciano Coelho
5a865bad44 nl80211/cfg80211: add max_sched_scan_ie_len in the hw description
Some chips may support different lengths of user-supplied IEs with a
single scheduled scan command than with a single normal scan command.

To support this, this patch creates a separate hardware description
element that describes the maximum size of user-supplied information
element data supported in scheduled scans.

Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-07-15 13:38:30 -04:00
Luciano Coelho
93b6aa693a nl80211/cfg80211: add max_sched_scan_ssids in the hw description
Some chips can scan more SSIDs with a single scheduled scan command
than with a single normal scan command (eg. wl12xx chips).

To support this, this patch creates a separate hardware description
element that describes the amount of SSIDs supported in scheduled
scans.

Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-07-15 13:38:29 -04:00
Johannes Berg
77dbbb1389 nl80211: advertise GTK rekey support, new triggers
Since we now have the necessary API in place to support
GTK rekeying, applications will need to know whether it
is supported by a device. Add a pseudo-trigger that is
used only to advertise that capability. Also, add some
new triggers that match what iwlagn devices can do.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-07-15 13:38:28 -04:00
John W. Linville
95a943c162 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
Conflicts:
	net/bluetooth/l2cap_core.c
2011-07-15 10:05:24 -04:00
Andy Lutomirski
574c44fa8f ia64: Replace clocksource.fsys_mmio with generic arch data
Now that clocksource.archdata is available, use it for ia64-specific
code.

Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: linux-ia64@vger.kernel.org
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andy Lutomirski <luto@mit.edu>
Link: http://lkml.kernel.org/r/d31de0ee0842a0e322fb6441571c2b0adb323fa2.1310563276.git.luto@mit.edu
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-07-14 17:57:09 -07:00
Michał Mirosław
62f2a3a48b net: remove NETIF_F_ALL_TX_OFFLOADS
There is no software fallback implemented for SCTP or FCoE checksumming,
and so it should not be passed on by software devices like bridge or bonding.

For VLAN devices, this is different. First, the driver for underlying device
should be prepared to get offloaded packets even when the feature is disabled
(especially if it advertises it in vlan_features). Second, devices under
VLANs do not get replaced without tearing down the VLAN first.

This fixes a mess I accidentally introduced while converting bonding to
ndo_fix_features.

NETIF_F_SOFT_FEATURES are removed from BOND_VLAN_FEATURES because they
are unused as of commit 712ae51afd.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-14 15:18:49 -07:00
Michał Mirosław
e20e694073 net: remove SK_ROUTE_CAPS from meta ematch
Remove it, as it indirectly exposes netdev features. It's not used in
iproute2 (2.6.38) - is anything else using its interface?

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-14 14:45:59 -07:00
Michał Mirosław
fec30c3381 net: unexport netdev_fix_features()
It is not used anywhere except net/core/dev.c now.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-14 14:44:32 -07:00
Steven Rostedt
4a9bd3f134 tracing: Have dynamic size event stack traces
Currently the stack trace per event in ftace is only 8 frames.
This can be quite limiting and sometimes useless. Especially when
the "ignore frames" is wrong and we also use up stack frames for
the event processing itself.

Change this to be dynamic by adding a percpu buffer that we can
write a large stack frame into and then copy into the ring buffer.

For interrupts and NMIs that come in while another event is being
process, will only get to use the 8 frame stack. That should be enough
as the task that it interrupted will have the full stack frame anyway.

Requested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-14 16:36:53 -04:00
Rafael J. Wysocki
0bc5b2debb ARM / shmobile: Use genpd_queue_power_off_work()
Make pd_power_down_a3rv() use genpd_queue_power_off_work() to queue
up the powering off of the A4LC domain to avoid queuing it up when
it is pending.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
2011-07-14 20:59:07 +02:00
David S. Miller
6a7ebdf2fd Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	net/bluetooth/l2cap_core.c
2011-07-14 07:56:40 -07:00
David S. Miller
f6b72b6217 net: Embed hh_cache inside of struct neighbour.
Now that there is a one-to-one correspondance between neighbour
and hh_cache entries, we no longer need:

1) dynamic allocation
2) attachment to dst->hh
3) refcounting

Initialization of the hh_cache entry is indicated by hh_len
being non-zero, and such initialization is always done with
the neighbour's lock held as a writer.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-14 07:53:20 -07:00
Glauber Costa
c9aaa8957f KVM: Steal time implementation
To implement steal time, we need the hypervisor to pass the guest
information about how much time was spent running other processes
outside the VM, while the vcpu had meaningful work to do - halt
time does not count.

This information is acquired through the run_delay field of
delayacct/schedstats infrastructure, that counts time spent in a
runqueue but not running.

Steal time is a per-cpu information, so the traditional MSR-based
infrastructure is used. A new msr, KVM_MSR_STEAL_TIME, holds the
memory area address containing information about steal time

This patch contains the hypervisor part of the steal time infrasructure,
and can be backported independently of the guest portion.

[avi, yongjie: export delayacct_on, to avoid build failures in some configs]

Signed-off-by: Glauber Costa <glommer@redhat.com>
Tested-by: Eric B Munson <emunson@mgebm.net>
CC: Rik van Riel <riel@redhat.com>
CC: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Yongjie Ren <yongjie.ren@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-07-14 12:59:14 +03:00
Dave Airlie
5762a179b6 Merge branch 'drm-intel-next' of ssh://master.kernel.org/pub/scm/linux/kernel/git/keithp/linux-2.6 into drm-core-next
* 'drm-intel-next' of ssh://master.kernel.org/pub/scm/linux/kernel/git/keithp/linux-2.6: (52 commits)
  drm/i915: provide module parameter description
  drm/i915: add module parameter compiler hints
  drm/i915/bios: Avoid temporary allocation whilst searching for downclock
  drm/i915: Cache GT fifo count for SandyBridge
  i915: Fix opregion notifications
  drm/i915: TVDAC_STATE_CHG does not indicate successful load-detect
  drm/i915: Select correct pipe during TV detect
  drm/i915/ringbuffer: Idling requires waiting for the ring to be empty
  Revert "drm/i915: enable rc6 by default"
  drm/i915: Clean up i915_driver_load failure path
  drm/i915: Enable i915 frame buffer compression by default
  drm/i915: Share the common work of disabling active FBC before updating
  drm/i915: Perform intel_enable_fbc() from a delayed task
  drm/i915: Disable FBC across page-flipping
  drm/i915: Set persistent-mode for ILK/SNB framebuffer compression
  drm/i915: Use of a CPU fence is mandatory to update FBC regions upon CPU writes
  drm/i915: Remove vestigial pitch from post-gen2 FBC control routines
  drm/i915: Replace direct calls to vfunc.disable_fbc with intel_disable_fbc()
  drm/i915: Only export the generic intel_disable_fbc() interface
  drm/i915: Enable GPU reset on Ivybridge.
  ...
2011-07-14 06:45:23 +01:00
Linus Torvalds
51414d4108 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
  mmc: core: Bus width testing needs to handle suspend/resume
2011-07-13 16:47:31 -07:00
Richard Kennedy
d7b7630130 block: reorder request_queue to remove 64 bit alignment padding
Reorder request_queue to remove 16 bytes of alignment padding in 64 bit
builds.

On my config this shrinks the size of this structure from 1608 to 1592
bytes and therefore needs one fewer cachelines.

Also trivially move the open bracket { to be on the same line as the
structure name to make it easier to grep.

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-07-13 21:17:49 +02:00
Philip Rakity
f39b2dd9d0 mmc: core: Bus width testing needs to handle suspend/resume
On reading the ext_csd for the first time (in 1 bit mode), save the
ext_csd information needed for bus width compare.

On every pass we make re-reading the ext_csd, compare the data
against the saved ext_csd data.

This fixes a regression introduced in 3.0-rc1 by 08ee80cc39
("mmc: core: eMMC bus width may not work on all platforms"), which
incorrectly assumed we would be re-reading the ext_csd at resume-
time.

Signed-off-by: Philip Rakity <prakity@marvell.com>
Tested-by: Jaehoon Chung <jh80.chung@samsung.com>
Signed-off-by: Chris Ball <cjb@laptop.org>
2011-07-13 14:54:37 -04:00
Andy Lutomirski
433bd805e5 clocksource: Replace vread with generic arch data
The vread field was bloating struct clocksource everywhere except
x86_64, and I want to change the way this works on x86_64, so let's
split it out into per-arch data.

Cc: x86@kernel.org
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: linux-ia64@vger.kernel.org
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: John Stultz <johnstul@us.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andy Lutomirski <luto@mit.edu>
Link: http://lkml.kernel.org/r/3ae5ec76a168eaaae63f08a2a1060b91aa0b7759.1310563276.git.luto@mit.edu
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-07-13 11:23:12 -07:00
Rafael J. Wysocki
5125bbf388 PM / Domains: Introduce function to power off all unused PM domains
Add a new function pm_genpd_poweroff_unused() queuing up the
execution of pm_genpd_poweroff() for every initialized generic PM
domain.  Calling it will cause every generic PM domain without
devices in use to be powered off.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Magnus Damm <damm@opensource.se>
2011-07-13 12:31:52 +02:00
David S. Miller
5c25f686db net: Kill support for multiple hh_cache entries per neighbour
This never, ever, happens.

Neighbour entries are always tied to one address family, and therefore
one set of dst_ops, and therefore one dst_ops->protocol "hh_type"
value.

This capability was blindly imported by Alexey Kuznetsov when he wrote
the neighbour layer.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-13 02:29:59 -07:00
David S. Miller
e69dd336ee net: Push protocol type directly down to header_ops->cache()
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-13 02:29:59 -07:00
Tejun Heo
1e01979c8f x86, numa: Implement pfn -> nid mapping granularity check
SPARSEMEM w/o VMEMMAP and DISCONTIGMEM, both used only on 32bit, use
sections array to map pfn to nid which is limited in granularity.  If
NUMA nodes are laid out such that the mapping cannot be accurate, boot
will fail triggering BUG_ON() in mminit_verify_page_links().

On 32bit, it's 512MiB w/ PAE and SPARSEMEM.  This seems to have been
granular enough until commit 2706a0bf7b (x86, NUMA: Enable
CONFIG_AMD_NUMA on 32bit too).  Apparently, there is a machine which
aligns NUMA nodes to 128MiB and has only AMD NUMA but not SRAT.  This
led to the following BUG_ON().

 On node 0 totalpages: 2096615
   DMA zone: 32 pages used for memmap
   DMA zone: 0 pages reserved
   DMA zone: 3927 pages, LIFO batch:0
   Normal zone: 1740 pages used for memmap
   Normal zone: 220978 pages, LIFO batch:31
   HighMem zone: 16405 pages used for memmap
   HighMem zone: 1853533 pages, LIFO batch:31
 BUG: Int 6: CR2   (null)
      EDI   (null)  ESI 00000002  EBP 00000002  ESP c1543ecc
      EBX f2400000  EDX 00000006  ECX   (null)  EAX 00000001
      err   (null)  EIP c16209aa   CS 00000060  flg 00010002
 Stack: f2400000 00220000 f7200800 c1620613 00220000 01000000 04400000 00238000
          (null) f7200000 00000002 f7200b58 f7200800 c1620929 000375fe   (null)
        f7200b80 c16395f0 00200a02 f7200a80   (null) 000375fe 00000002   (null)
 Pid: 0, comm: swapper Not tainted 2.6.39-rc5-00181-g2706a0b #17
 Call Trace:
  [<c136b1e5>] ? early_fault+0x2e/0x2e
  [<c16209aa>] ? mminit_verify_page_links+0x12/0x42
  [<c1620613>] ? memmap_init_zone+0xaf/0x10c
  [<c1620929>] ? free_area_init_node+0x2b9/0x2e3
  [<c1607e99>] ? free_area_init_nodes+0x3f2/0x451
  [<c1601d80>] ? paging_init+0x112/0x118
  [<c15f578d>] ? setup_arch+0x791/0x82f
  [<c15f43d9>] ? start_kernel+0x6a/0x257

This patch implements node_map_pfn_alignment() which determines
maximum internode alignment and update numa_register_memblks() to
reject NUMA configuration if alignment exceeds the pfn -> nid mapping
granularity of the memory model as determined by PAGES_PER_SECTION.

This makes the problematic machine boot w/ flatmem by rejecting the
NUMA config and provides protection against crazy NUMA configurations.

Signed-off-by: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/20110712074534.GB2872@htj.dyndns.org
LKML-Reference: <20110628174613.GP478@escobedo.osrc.amd.com>
Reported-and-Tested-by: Hans Rosenfeld <hans.rosenfeld@amd.com>
Cc: Conny Seidel <conny.seidel@amd.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-07-12 21:58:29 -07:00
Linus Torvalds
8d86e5f914 Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc/mm: Fix memory_block_size_bytes() for non-pseries
  mm: Move definition of MIN_MEMORY_BLOCK_SIZE to a header
2011-07-12 14:21:19 -07:00
Linus Torvalds
d93a881dd7 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/linux-arm-soc:
  pcmcia: pxa2xx/vpac270: free gpios on exist rather than requesting
  ARM: pxa/raumfeld: fix device name for codec ak4104
  ARM: pxa/raumfeld: display initialisation fixes
  ARM: pxa/raumfeld: adapt to upcoming hardware change
  ARM: pxa: fix gpio_to_chip() clash with gpiolib namespace
  genirq: replace irq_gc_ack() with {set,clr}_bit variants (fwd)
  arm: mach-vt8500: add forgotten irq_data conversion
  ARM: pxa168: correct nand pmu setting
  ARM: pxa910: correct nand pmu setting
  ARM: pxa: fix PGSR register address calculation
2011-07-12 14:19:51 -07:00
David S. Miller
e2270ea62a netdevice: Kill 'feature' test macros.
Almost all of these have long outstayed their welcome.

And for every one of these macros, there are 10 features for which we
didn't add macros.

Let's just delete them all, and get out of habit of doing things this
way.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
2011-07-12 12:28:58 -07:00
Shaohua Li
383cd7213f CFQ: move think time check variables to a separate struct
Move the variables to do think time check to a sepatate struct. This is
to prepare adding think time check for service tree and group. No
functional change.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-07-12 14:24:35 +02:00
Gleb Natapov
e03b644fe6 KVM: introduce kvm_read_guest_cached
Introduce kvm_read_guest_cached() function in addition to write one we
already have.

[ by glauber: export function signature in kvm header ]

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Glauber Costa <glommer@redhat.com>
Acked-by: Rik van Riel <riel@redhat.com>
Tested-by: Eric Munson <emunson@mgebm.net>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-07-12 13:17:01 +03:00
Paul Mackerras
aa04b4cc5b KVM: PPC: Allocate RMAs (Real Mode Areas) at boot for use by guests
This adds infrastructure which will be needed to allow book3s_hv KVM to
run on older POWER processors, including PPC970, which don't support
the Virtual Real Mode Area (VRMA) facility, but only the Real Mode
Offset (RMO) facility.  These processors require a physically
contiguous, aligned area of memory for each guest.  When the guest does
an access in real mode (MMU off), the address is compared against a
limit value, and if it is lower, the address is ORed with an offset
value (from the Real Mode Offset Register (RMOR)) and the result becomes
the real address for the access.  The size of the RMA has to be one of
a set of supported values, which usually includes 64MB, 128MB, 256MB
and some larger powers of 2.

Since we are unlikely to be able to allocate 64MB or more of physically
contiguous memory after the kernel has been running for a while, we
allocate a pool of RMAs at boot time using the bootmem allocator.  The
size and number of the RMAs can be set using the kvm_rma_size=xx and
kvm_rma_count=xx kernel command line options.

KVM exports a new capability, KVM_CAP_PPC_RMA, to signal the availability
of the pool of preallocated RMAs.  The capability value is 1 if the
processor can use an RMA but doesn't require one (because it supports
the VRMA facility), or 2 if the processor requires an RMA for each guest.

This adds a new ioctl, KVM_ALLOCATE_RMA, which allocates an RMA from the
pool and returns a file descriptor which can be used to map the RMA.  It
also returns the size of the RMA in the argument structure.

Having an RMA means we will get multiple KMV_SET_USER_MEMORY_REGION
ioctl calls from userspace.  To cope with this, we now preallocate the
kvm->arch.ram_pginfo array when the VM is created with a size sufficient
for up to 64GB of guest memory.  Subsequently we will get rid of this
array and use memory associated with each memslot instead.

This moves most of the code that translates the user addresses into
host pfns (page frame numbers) out of kvmppc_prepare_vrma up one level
to kvmppc_core_prepare_memory_region.  Also, instead of having to look
up the VMA for each page in order to check the page size, we now check
that the pages we get are compound pages of 16MB.  However, if we are
adding memory that is mapped to an RMA, we don't bother with calling
get_user_pages_fast and instead just offset from the base pfn for the
RMA.

Typically the RMA gets added after vcpus are created, which makes it
inconvenient to have the LPCR (logical partition control register) value
in the vcpu->arch struct, since the LPCR controls whether the processor
uses RMA or VRMA for the guest.  This moves the LPCR value into the
kvm->arch struct and arranges for the MER (mediated external request)
bit, which is the only bit that varies between vcpus, to be set in
assembly code when going into the guest if there is a pending external
interrupt request.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:57 +03:00
Paul Mackerras
371fefd6f2 KVM: PPC: Allow book3s_hv guests to use SMT processor modes
This lifts the restriction that book3s_hv guests can only run one
hardware thread per core, and allows them to use up to 4 threads
per core on POWER7.  The host still has to run single-threaded.

This capability is advertised to qemu through a new KVM_CAP_PPC_SMT
capability.  The return value of the ioctl querying this capability
is the number of vcpus per virtual CPU core (vcore), currently 4.

To use this, the host kernel should be booted with all threads
active, and then all the secondary threads should be offlined.
This will put the secondary threads into nap mode.  KVM will then
wake them from nap mode and use them for running guest code (while
they are still offline).  To wake the secondary threads, we send
them an IPI using a new xics_wake_cpu() function, implemented in
arch/powerpc/sysdev/xics/icp-native.c.  In other words, at this stage
we assume that the platform has a XICS interrupt controller and
we are using icp-native.c to drive it.  Since the woken thread will
need to acknowledge and clear the IPI, we also export the base
physical address of the XICS registers using kvmppc_set_xics_phys()
for use in the low-level KVM book3s code.

When a vcpu is created, it is assigned to a virtual CPU core.
The vcore number is obtained by dividing the vcpu number by the
number of threads per core in the host.  This number is exported
to userspace via the KVM_CAP_PPC_SMT capability.  If qemu wishes
to run the guest in single-threaded mode, it should make all vcpu
numbers be multiples of the number of threads per core.

We distinguish three states of a vcpu: runnable (i.e., ready to execute
the guest), blocked (that is, idle), and busy in host.  We currently
implement a policy that the vcore can run only when all its threads
are runnable or blocked.  This way, if a vcpu needs to execute elsewhere
in the kernel or in qemu, it can do so without being starved of CPU
by the other vcpus.

When a vcore starts to run, it executes in the context of one of the
vcpu threads.  The other vcpu threads all go to sleep and stay asleep
until something happens requiring the vcpu thread to return to qemu,
or to wake up to run the vcore (this can happen when another vcpu
thread goes from busy in host state to blocked).

It can happen that a vcpu goes from blocked to runnable state (e.g.
because of an interrupt), and the vcore it belongs to is already
running.  In that case it can start to run immediately as long as
the none of the vcpus in the vcore have started to exit the guest.
We send the next free thread in the vcore an IPI to get it to start
to execute the guest.  It synchronizes with the other threads via
the vcore->entry_exit_count field to make sure that it doesn't go
into the guest if the other vcpus are exiting by the time that it
is ready to actually enter the guest.

Note that there is no fixed relationship between the hardware thread
number and the vcpu number.  Hardware threads are assigned to vcpus
as they become runnable, so we will always use the lower-numbered
hardware threads in preference to higher-numbered threads if not all
the vcpus in the vcore are runnable, regardless of which vcpus are
runnable.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:57 +03:00
David Gibson
54738c0971 KVM: PPC: Accelerate H_PUT_TCE by implementing it in real mode
This improves I/O performance for guests using the PAPR
paravirtualization interface by making the H_PUT_TCE hcall faster, by
implementing it in real mode.  H_PUT_TCE is used for updating virtual
IOMMU tables, and is used both for virtual I/O and for real I/O in the
PAPR interface.

Since this moves the IOMMU tables into the kernel, we define a new
KVM_CREATE_SPAPR_TCE ioctl to allow qemu to create the tables.  The
ioctl returns a file descriptor which can be used to mmap the newly
created table.  The qemu driver models use them in the same way as
userspace managed tables, but they can be updated directly by the
guest with a real-mode H_PUT_TCE implementation, reducing the number
of host/guest context switches during guest IO.

There are certain circumstances where it is useful for userland qemu
to write to the TCE table even if the kernel H_PUT_TCE path is used
most of the time.  Specifically, allowing this will avoid awkwardness
when we need to reset the table.  More importantly, we will in the
future need to write the table in order to restore its state after a
checkpoint resume or migration.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:56 +03:00
Paul Mackerras
de56a948b9 KVM: PPC: Add support for Book3S processors in hypervisor mode
This adds support for KVM running on 64-bit Book 3S processors,
specifically POWER7, in hypervisor mode.  Using hypervisor mode means
that the guest can use the processor's supervisor mode.  That means
that the guest can execute privileged instructions and access privileged
registers itself without trapping to the host.  This gives excellent
performance, but does mean that KVM cannot emulate a processor
architecture other than the one that the hardware implements.

This code assumes that the guest is running paravirtualized using the
PAPR (Power Architecture Platform Requirements) interface, which is the
interface that IBM's PowerVM hypervisor uses.  That means that existing
Linux distributions that run on IBM pSeries machines will also run
under KVM without modification.  In order to communicate the PAPR
hypercalls to qemu, this adds a new KVM_EXIT_PAPR_HCALL exit code
to include/linux/kvm.h.

Currently the choice between book3s_hv support and book3s_pr support
(i.e. the existing code, which runs the guest in user mode) has to be
made at kernel configuration time, so a given kernel binary can only
do one or the other.

This new book3s_hv code doesn't support MMIO emulation at present.
Since we are running paravirtualized guests, this isn't a serious
restriction.

With the guest running in supervisor mode, most exceptions go straight
to the guest.  We will never get data or instruction storage or segment
interrupts, alignment interrupts, decrementer interrupts, program
interrupts, single-step interrupts, etc., coming to the hypervisor from
the guest.  Therefore this introduces a new KVMTEST_NONHV macro for the
exception entry path so that we don't have to do the KVM test on entry
to those exception handlers.

We do however get hypervisor decrementer, hypervisor data storage,
hypervisor instruction storage, and hypervisor emulation assist
interrupts, so we have to handle those.

In hypervisor mode, real-mode accesses can access all of RAM, not just
a limited amount.  Therefore we put all the guest state in the vcpu.arch
and use the shadow_vcpu in the PACA only for temporary scratch space.
We allocate the vcpu with kzalloc rather than vzalloc, and we don't use
anything in the kvmppc_vcpu_book3s struct, so we don't allocate it.
We don't have a shared page with the guest, but we still need a
kvm_vcpu_arch_shared struct to store the values of various registers,
so we include one in the vcpu_arch struct.

The POWER7 processor has a restriction that all threads in a core have
to be in the same partition.  MMU-on kernel code counts as a partition
(partition 0), so we have to do a partition switch on every entry to and
exit from the guest.  At present we require the host and guest to run
in single-thread mode because of this hardware restriction.

This code allocates a hashed page table for the guest and initializes
it with HPTEs for the guest's Virtual Real Memory Area (VRMA).  We
require that the guest memory is allocated using 16MB huge pages, in
order to simplify the low-level memory management.  This also means that
we can get away without tracking paging activity in the host for now,
since huge pages can't be paged or swapped.

This also adds a few new exports needed by the book3s_hv code.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-07-12 13:16:54 +03:00
Jan Kiszka
91e3d71db2 KVM: Clarify KVM_ASSIGN_PCI_DEVICE documentation
Neither host_irq nor the guest_msi struct are used anymore today.
Tag the former, drop the latter to avoid confusion.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-07-12 13:16:16 +03:00
Kevin Hilman
f3393b62f1 PM / Runtime: Add new helper function: pm_runtime_status_suspended()
This boolean function simply returns whether or not the runtime status
of the device is 'suspended'.  Unlike pm_runtime_suspended(), this
function returns the runtime status whether or not runtime PM for the
device has been disabled or not.

Also add entry to Documentation/power/runtime.txt

Signed-off-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-07-12 11:17:09 +02:00
Justin TerAvest
4aede84b33 fixlet: Remove fs_excl from struct task.
fs_excl is a poor man's priority inheritance for filesystems to hint to
the block layer that an operation is important. It was never clearly
specified, not widely adopted, and will not prevent starvation in many
cases (like across cgroups).

fs_excl was introduced with the time sliced CFQ IO scheduler, to
indicate when a process held FS exclusive resources and thus needed
a boost.

It doesn't cover all file systems, and it was never fully complete.
Lets kill it.

Signed-off-by: Justin TerAvest <teravest@google.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-07-12 08:35:10 +02:00
Eric Dumazet
4915a0de43 net: introduce __netdev_alloc_skb_ip_align
RX rings should use GFP_KERNEL allocations if possible, add
__netdev_alloc_skb_ip_align() helper to ease this.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-11 20:08:34 -07:00
Benjamin Herrenschmidt
a63fdc5156 mm: Move definition of MIN_MEMORY_BLOCK_SIZE to a header
The macro MIN_MEMORY_BLOCK_SIZE is currently defined twice in two .c
files, and I need it in a third one to fix a powerpc bug, so let's
first move it into a header

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
2011-07-12 11:08:01 +10:00
Rafael J. Wysocki
c6d22b3726 PM / Domains: Allow callbacks to execute all runtime PM helpers
A deadlock may occur if one of the PM domains' .start_device() or
.stop_device() callbacks or a device driver's .runtime_suspend() or
.runtime_resume() callback executed by the core generic PM domain
code uses a "wrong" runtime PM helper function.  This happens, for
example, if .runtime_resume() from one device's driver calls
pm_runtime_resume() for another device in the same PM domain.
A similar situation may take place if a device's parent is in the
same PM domain, in which case the runtime PM framework may execute
pm_genpd_runtime_resume() automatically for the parent (if it is
suspended at the moment).  This, of course, is undesirable, so
the generic PM domains code should be modified to prevent it from
happening.

The runtime PM framework guarantees that pm_genpd_runtime_suspend()
and pm_genpd_runtime_resume() won't be executed in parallel for
the same device, so the generic PM domains code need not worry
about those cases.  Still, it needs to prevent the other possible
race conditions between pm_genpd_runtime_suspend(),
pm_genpd_runtime_resume(), pm_genpd_poweron() and pm_genpd_poweroff()
from happening and it needs to avoid deadlocks at the same time.
To this end, modify the generic PM domains code to relax
synchronization rules so that:

* pm_genpd_poweron() doesn't wait for the PM domain status to
  change from GPD_STATE_BUSY.  If it finds that the status is
  not GPD_STATE_POWER_OFF, it returns without powering the domain on
  (it may modify the status depending on the circumstances).

* pm_genpd_poweroff() returns as soon as it finds that the PM
  domain's status changed from GPD_STATE_BUSY after it's released
  the PM domain's lock.

* pm_genpd_runtime_suspend() doesn't wait for the PM domain status
  to change from GPD_STATE_BUSY after executing the domain's
  .stop_device() callback and executes pm_genpd_poweroff() only
  if pm_genpd_runtime_resume() is not executed in parallel.

* pm_genpd_runtime_resume() doesn't wait for the PM domain status
  to change from GPD_STATE_BUSY after executing pm_genpd_poweron()
  and sets the domain's status to GPD_STATE_BUSY and increments its
  counter of resuming devices (introduced by this change) immediately
  after acquiring the lock.  The counter of resuming devices is then
  decremented after executing __pm_genpd_runtime_resume() for the
  device and the domain's status is reset to GPD_STATE_ACTIVE (unless
  there are more resuming devices in the domain, in which case the
  status remains GPD_STATE_BUSY).

This way, for example, if a device driver's .runtime_resume()
callback executes pm_runtime_resume() for another device in the same
PM domain, pm_genpd_poweron() called by pm_genpd_runtime_resume()
invoked by the runtime PM framework will not block and it will see
that there's nothing to do for it.  Next, the PM domain's lock will
be acquired without waiting for its status to change from
GPD_STATE_BUSY and the device driver's .runtime_resume() callback
will be executed.  In turn, if pm_runtime_suspend() is executed by
one device driver's .runtime_resume() callback for another device in
the same PM domain, pm_genpd_poweroff() executed by
pm_genpd_runtime_suspend() invoked by the runtime PM framework as a
result will notice that one of the devices in the domain is being
resumed, so it will return immediately.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-07-12 00:39:36 +02:00
Rafael J. Wysocki
17b75eca76 PM / Domains: Do not execute device callbacks under locks
Currently, the .start_device() and .stop_device() callbacks from
struct generic_pm_domain() as well as the device drivers' runtime PM
callbacks used by the generic PM domains code are executed under
the generic PM domain lock.  This, unfortunately, is prone to
deadlocks, for example if a device and its parent are boths members
of the same PM domain.  For this reason, it would be better if the
PM domains code didn't execute device callbacks under the lock.

Rework the locking in the generic PM domains code so that the lock
is dropped for the execution of device callbacks.  To this end,
introduce PM domains states reflecting the current status of a PM
domain and such that the PM domain lock cannot be acquired if the
status is GPD_STATE_BUSY.  Make threads attempting to acquire a PM
domain's lock wait until the status changes to either
GPD_STATE_ACTIVE or GPD_STATE_POWER_OFF.

This change by itself doesn't fix the deadlock problem mentioned
above, but the mechanism introduced by it will be used for for this
purpose by a subsequent patch.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-07-12 00:39:29 +02:00
Steven Rostedt
04da85b861 ftrace: Fix warning when CONFIG_FUNCTION_TRACER is not defined
The struct ftrace_hash was declared within CONFIG_FUNCTION_TRACER
but was referenced outside of it.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-11 10:12:59 -04:00
Jiri Kosina
b7e9c223be Merge branch 'master' into for-next
Sync with Linus' tree to be able to apply pending patches that
are based on newer code already present upstream.
2011-07-11 14:15:55 +02:00
Daniel Baluta
d84e0bd797 skbuff: update struct sk_buff members comments
Rearrange struct sk_buff members comments to follow their
definition order. Also, add missing comments for ooo_okay
and dropcount members.

Signed-off-by: Daniel Baluta <dbaluta@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-10 07:04:04 -07:00
Andy Green
5a9aaf0cf8 I2C: OMAP1/OMAP2+: create omap I2C functionality flags for each cpu_... test
These represent the 8 kinds of implementation functionality
that up until now were inferred by the 16 remaining cpu_...()
tests in the omap i2c driver.

Changed to use BIT() as suggested by Balaji T Krishnamoorthy.

Cc: patches@linaro.org
Cc: Ben Dooks <ben-linux@fluff.org>
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andy Green <andy.green@linaro.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Paul Walmsley <paul@pwsan.com>
2011-07-10 05:27:15 -06:00
Andy Green
d72fe7883f I2C: OMAP2+: Introduce I2C IP versioning constants
These represent the two kinds of (incompatible) OMAP I2C
peripheral unit in use so far.

The constants are in linux/i2c-omap.h so the omap i2c driver can have
them too.

Cc: patches@linaro.org
Cc: Ben Dooks <ben-linux@fluff.org>
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andy Green <andy.green@linaro.org>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Paul Walmsley <paul@pwsan.com>
2011-07-10 05:27:14 -06:00
Magnus Damm
18b4f3f5d0 PM / Domains: Export pm_genpd_poweron() in header
Allow SoC-specific code to call pm_genpd_poweron().

Signed-off-by: Magnus Damm <damm@opensource.se>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-07-10 10:39:14 +02:00
Wu Fengguang
1a12d8bd7b writeback: scale IO chunk size up to half device bandwidth
Originally, MAX_WRITEBACK_PAGES was hard-coded to 1024 because of a
concern of not holding I_SYNC for too long.  (At least, that was the
comment previously.)  This doesn't make sense now because the only
time we wait for I_SYNC is if we are calling sync or fsync, and in
that case we need to write out all of the data anyway.  Previously
there may have been other code paths that waited on I_SYNC, but not
any more.					    -- Theodore Ts'o

So remove the MAX_WRITEBACK_PAGES constraint. The writeback pages
will adapt to as large as the storage device can write within 500ms.

XFS is observed to do IO completions in a batch, and the batch size is
equal to the write chunk size. To avoid dirty pages to suddenly drop
out of balance_dirty_pages()'s dirty control scope and create large
fluctuations, the chunk size is also limited to half the control scope.

The balance_dirty_pages() control scrope is

	[(background_thresh + dirty_thresh) / 2, dirty_thresh]

which is by default [15%, 20%] of global dirty pages, whose range size
is dirty_thresh / DIRTY_FULL_SCOPE.

The adpative write chunk size will be rounded to the nearest 4MB
boundary.

http://bugzilla.kernel.org/show_bug.cgi?id=13930

CC: Theodore Ts'o <tytso@mit.edu>
CC: Dave Chinner <david@fromorbit.com>
CC: Chris Mason <chris.mason@oracle.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-07-09 22:09:03 -07:00
Wu Fengguang
ffd1f609ab writeback: introduce max-pause and pass-good dirty limits
The max-pause limit helps to keep the sleep time inside
balance_dirty_pages() within MAX_PAUSE=200ms. The 200ms max sleep means
per task rate limit of 8pages/200ms=160KB/s when dirty exceeded, which
normally is enough to stop dirtiers from continue pushing the dirty
pages high, unless there are a sufficient large number of slow dirtiers
(eg. 500 tasks doing 160KB/s will still sum up to 80MB/s, exceeding the
write bandwidth of a slow disk and hence accumulating more and more dirty
pages).

The pass-good limit helps to let go of the good bdi's in the presence of
a blocked bdi (ie. NFS server not responding) or slow USB disk which for
some reason build up a large number of initial dirty pages that refuse
to go away anytime soon.

For example, given two bdi's A and B and the initial state

	bdi_thresh_A = dirty_thresh / 2
	bdi_thresh_B = dirty_thresh / 2
	bdi_dirty_A  = dirty_thresh / 2
	bdi_dirty_B  = dirty_thresh / 2

Then A get blocked, after a dozen seconds

	bdi_thresh_A = 0
	bdi_thresh_B = dirty_thresh
	bdi_dirty_A  = dirty_thresh / 2
	bdi_dirty_B  = dirty_thresh / 2

The (bdi_dirty_B < bdi_thresh_B) test is now useless and the dirty pages
will be effectively throttled by condition (nr_dirty < dirty_thresh).
This has two problems:
(1) we lose the protections for light dirtiers
(2) balance_dirty_pages() effectively becomes IO-less because the
    (bdi_nr_reclaimable > bdi_thresh) test won't be true. This is good
    for IO, but balance_dirty_pages() loses an important way to break
    out of the loop which leads to more spread out throttle delays.

DIRTY_PASSGOOD_AREA can eliminate the above issues. The only problem is,
DIRTY_PASSGOOD_AREA needs to be defined as 2 to fully cover the above
example while this patch uses the more conservative value 8 so as not to
surprise people with too many dirty pages than expected.

The max-pause limit won't noticeably impact the speed dirty pages are
knocked down when there is a sudden drop of global/bdi dirty thresholds.
Because the heavy dirties will be throttled below 160KB/s which is slow
enough. It does help to avoid long dirty throttle delays and especially
will make light dirtiers more responsive.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-07-09 22:09:02 -07:00
Wu Fengguang
c42843f2f0 writeback: introduce smoothed global dirty limit
The start of a heavy weight application (ie. KVM) may instantly knock
down determine_dirtyable_memory() if the swap is not enabled or full.
global_dirty_limits() and bdi_dirty_limit() will in turn get global/bdi
dirty thresholds that are _much_ lower than the global/bdi dirty pages.

balance_dirty_pages() will then heavily throttle all dirtiers including
the light ones, until the dirty pages drop below the new dirty thresholds.
During this _deep_ dirty-exceeded state, the system may appear rather
unresponsive to the users.

About "deep" dirty-exceeded: task_dirty_limit() assigns 1/8 lower dirty
threshold to heavy dirtiers than light ones, and the dirty pages will
be throttled around the heavy dirtiers' dirty threshold and reasonably
below the light dirtiers' dirty threshold. In this state, only the heavy
dirtiers will be throttled and the dirty pages are carefully controlled
to not exceed the light dirtiers' dirty threshold. However if the
threshold itself suddenly drops below the number of dirty pages, the
light dirtiers will get heavily throttled.

So introduce global_dirty_limit for tracking the global dirty threshold
with policies

- follow downwards slowly
- follow up in one shot

global_dirty_limit can effectively mask out the impact of sudden drop of
dirtyable memory. It will be used in the next patch for two new type of
dirty limits. Note that the new dirty limits are not going to avoid
throttling the light dirtiers, but could limit their sleep time to 200ms.

Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-07-09 22:09:02 -07:00
Wu Fengguang
e98be2d599 writeback: bdi write bandwidth estimation
The estimation value will start from 100MB/s and adapt to the real
bandwidth in seconds.

It tries to update the bandwidth only when disk is fully utilized.
Any inactive period of more than one second will be skipped.

The estimated bandwidth will be reflecting how fast the device can
writeout when _fully utilized_, and won't drop to 0 when it goes idle.
The value will remain constant at disk idle time. At busy write time, if
not considering fluctuations, it will also remain high unless be knocked
down by possible concurrent reads that compete for the disk time and
bandwidth with async writes.

The estimation is not done purely in the flusher because there is no
guarantee for write_cache_pages() to return timely to update bandwidth.

The bdi->avg_write_bandwidth smoothing is very effective for filtering
out sudden spikes, however may be a little biased in long term.

The overheads are low because the bdi bandwidth update only occurs at
200ms intervals.

The 200ms update interval is suitable, because it's not possible to get
the real bandwidth for the instance at all, due to large fluctuations.

The NFS commits can be as large as seconds worth of data. One XFS
completion may be as large as half second worth of data if we are going
to increase the write chunk to half second worth of data. In ext4,
fluctuations with time period of around 5 seconds is observed. And there
is another pattern of irregular periods of up to 20 seconds on SSD tests.

That's why we are not only doing the estimation at 200ms intervals, but
also averaging them over a period of 3 seconds and then go further to do
another level of smoothing in avg_write_bandwidth.

CC: Li Shaohua <shaohua.li@intel.com>
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-07-09 22:09:01 -07:00
Jan Kara
f7d2b1ecd0 writeback: account per-bdi accumulated written pages
Introduce the BDI_WRITTEN counter. It will be used for estimating the
bdi's write bandwidth.

Peter Zijlstra <a.p.zijlstra@chello.nl>:
Move BDI_WRITTEN accounting into __bdi_writeout_inc().
This will cover and fix fuse, which only calls bdi_writeout_inc().

CC: Michael Rubin <mrubin@google.com>
Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-07-09 22:09:01 -07:00
Wu Fengguang
d46db3d582 writeback: make writeback_control.nr_to_write straight
Pass struct wb_writeback_work all the way down to writeback_sb_inodes(),
and initialize the struct writeback_control there.

struct writeback_control is basically designed to control writeback of a
single file, but we keep abuse it for writing multiple files in
writeback_sb_inodes() and its callers.

It immediately clean things up, e.g. suddenly wbc.nr_to_write vs
work->nr_pages starts to make sense, and instead of saving and restoring
pages_skipped in writeback_sb_inodes it can always start with a clean
zero value.

It also makes a neat IO pattern change: large dirty files are now
written in the full 4MB writeback chunk size, rather than whatever
remained quota in wbc->nr_to_write.

Acked-by: Jan Kara <jack@suse.cz>
Proposed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
2011-07-09 22:09:01 -07:00
Jean-François Dagenais
f607e7fc5f w1: ds1wm: add a reset recovery parameter
This fixes a regression in 3.0 reported by Paul Parsons regarding the
removal of the msleep(1) in the ds1wm_reset() function:

: The linux-3.0-rc4 DS1WM 1-wire driver is logging "bus error, retrying"
: error messages on an HP iPAQ hx4700 PDA (XScale-PXA270):
:
: <snip>
: Driver for 1-wire Dallas network protocol.
: DS1WM w1 busmaster driver - (c) 2004 Szabolcs Gyurko
: 1-Wire driver for the DS2760 battery monitor  chip  - (c) 2004-2005, Szabolcs Gyurko
: ds1wm ds1wm: pass: 1 bus error, retrying
: ds1wm ds1wm: pass: 2 bus error, retrying
: ds1wm ds1wm: pass: 3 bus error, retrying
: ds1wm ds1wm: pass: 4 bus error, retrying
: ds1wm ds1wm: pass: 5 bus error, retrying
: ...
:
: The visible result is that the battery charging LED is erratic; sometimes
: it works, mostly it doesn't.
:
: The linux-2.6.39 DS1WM 1-wire driver worked OK.  I haven't tried 3.0-rc1,
: 3.0-rc2, or 3.0-rc3.

This sleep should not be required on normal circuitry provided the
pull-ups on the bus are correctly adapted to the slaves.  Unfortunately,
this is not always the case.  The sleep is restored but as a parameter to
the probe function in the pdata.

[akpm@linux-foundation.org: coding-style fixes]
Reported-by: Paul Parsons <lost.distance@yahoo.com>
Tested-by: Paul Parsons <lost.distance@yahoo.com>
Signed-off-by: Jean-François Dagenais <dagenaisj@sonatest.com>
Cc: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-08 21:14:44 -07:00
Greg Kroah-Hartman
18fbb93fbe Merge branch 'for-next' of master.kernel.org:/pub/scm/linux/kernel/git/balbi/usb into usb-next
* 'for-next' of master.kernel.org:/pub/scm/linux/kernel/git/balbi/usb:
  usb: gadget: m66592-udc: add pullup function
  usb: gadget: m66592-udc: add function for external controller
  usb: gadget: r8a66597-udc: add pullup function
  usb: gadget: zero: add superspeed support
  usb: gadget: add SS descriptors to Ethernet gadget
  usb: gadget: r8a66597-udc: add support for TEST_MODE
  usb: gadget: m66592-udc: add support for TEST_MODE
  usb: gadget: r8a66597-udc: Make BUSWAIT configurable through platform data
  usb: gadget: r8a66597-udc: fix cannot connect after rmmod gadget driver
  usb: update email address in r8a66597-udc and m66592-udc
  usb: musb: restore INDEX register in resume path
  usb: gadget: fix up depencies
  usb: gadget: fusb300_udc: fix compile warnings
  usb: gadget: ci13xx_udc.c: fix compile warning
  usb: gadget: net2272: fix compile warnings
  usb: gadget: langwell_udc: fix compile warnings
  usb: gadget: fusb300_udc: drop dead code
2011-07-08 15:30:55 -07:00
Shreshtha Kumar Sahu
def90f4239 amba pl011: workaround for uart registers lockup
This workaround aims to break the deadlock situation
which raises during continuous transfer of data for long
duration over uart with hardware flow control. It is
observed that CTS interrupt cannot be cleared in uart
interrupt register (ICR). Hence further transfer over
uart gets blocked.

It is seen that during such deadlock condition ICR
don't get cleared even on multiple write. This leads
pass_counter to decrease and finally reach zero. This
can be taken as trigger point to run this UART_BT_WA.

Workaround backups the register configuration, does soft
reset of UART using BIT-0 of PRCC_K_SOFTRST_SET/CLEAR
registers and restores the registers.

This patch also provides support for uart init and exit
function calls if present.

Signed-off-by: Shreshtha Kumar Sahu <shreshthakumar.sahu@stericsson.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-07-08 15:09:23 -07:00
Yoshihiro Shimoda
bb59dbff4e usb: gadget: m66592-udc: add function for external controller
M66592 has the pin of WR0 and WR1. So, if one write-pin of CPU
connects to the pins, we have to change the setting of FIFOSEL
register in the controller. If we don't change the setting,
the controller cannot send the data of odd length.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
2011-07-09 01:08:39 +03:00
Yoshihiro Shimoda
45304e8cd9 usb: update email address in ohci-sh and r8a66597-hcd
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-07-08 14:57:12 -07:00
Yoshihiro Shimoda
f2e9039a43 usb: r8a66597-hcd: add function for external controller
R8A66597 has the pin of WR0 and WR1. So, if one write-pin of CPU
connects to the pins, we have to change the setting of FIFOSEL
register in the controller. If we don't change the setting,
the controller cannot send the data of odd length.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-07-08 14:57:11 -07:00
Michal Hocko
d8bf4ca9ca rcu: treewide: Do not use rcu_read_lock_held when calling rcu_dereference_check
Since ca5ecddf (rcu: define __rcu address space modifier for sparse)
rcu_dereference_check use rcu_read_lock_held as a part of condition
automatically so callers do not have to do that as well.

Signed-off-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-07-08 22:21:58 +02:00
Shawn Guo
b98c023920 dt: add empty of_property_read_u32[_array] for non-dt
The patch adds empty functions of_property_read_u32 and
of_property_read_u32_array for non-dt build, so that drivers
migrating to dt can save some '#ifdef CONFIG_OF'.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
[grant.likely: Moved things around so only one new static inline is needed]
[grant.likely: Added _string variant]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-07-08 12:52:33 -06:00
Luciano Coelho
1a84ff7564 cfg80211: return -ENOENT when stopping sched_scan while not running
If we try to stop a scheduled scan while it is not running, we should
return -ENOENT instead of simply ignoring the command and returning
success.  This is more consistent with other parts of the code.

Reported-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: Luciano Coelho <coelho@ti.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-07-08 11:47:54 -04:00
John W. Linville
204d1641d2 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2011-07-08 11:03:36 -04:00
Dima Zavin
732375c6a5 plist: Remove the need to supply locks to plist heads
This was legacy code brought over from the RT tree and
is no longer necessary.

Signed-off-by: Dima Zavin <dima@android.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Daniel Walker <dwalker@codeaurora.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Link: http://lkml.kernel.org/r/1310084879-10351-2-git-send-email-dima@android.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-08 14:02:53 +02:00
Yoshihiro Shimoda
5154e9f126 usb: gadget: r8a66597-udc: Make BUSWAIT configurable through platform data
BUSWAIT is a 4-bit-wide value that controls the number of access waits
from the CPU to on-chip USB module. b'0000 inserts 0 wait (2 access
cycles) and b'1111 inserts 15 waits (17 access cycles, hardware
initial value), respectively.

BUSWAIT value depends on peripheral clock frequency supplied to on-chip
of each CPU, hence should be configurable through platform data.

Note that this patch assumes that b'0000 (0 wait, 2 access cycles) is
rerely used and considered as invalid. If valid 'buswait' data is not
provided by platform, initial b'1111 (15 waits, 17 access cycles) will
be applied as a safe default.

Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
2011-07-08 12:47:42 +03:00
Shaohua Li
316cc67d5e block: document blk_plug list access
I'm often confused why not disable preempt when changing blk_plug list. It
would be better to add comments here in case others have the similar concerns.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-07-08 08:19:21 +02:00
Shaohua Li
55c022bbdd block: avoid building too big plug list
When I test fio script with big I/O depth, I found the total throughput drops
compared to some relative small I/O depth. The reason is the thread accumulates
big requests in its plug list and causes some delays (surely this depends
on CPU speed).
I thought we'd better have a threshold for requests. When a threshold reaches,
this means there is no request merge and queue lock contention isn't severe
when pushing per-task requests to queue, so the main advantages of blk plug
don't exist. We can force a plug list flush in this case.
With this, my test throughput actually increases and almost equals to small
I/O depth. Another side effect is irq off time decreases in blk_flush_plug_list()
for big I/O depth.
The BLK_MAX_REQUEST_COUNT is choosen arbitarily, but 16 is efficiently to
reduce lock contention to me. But I'm open here, 32 is ok in my test too.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-07-08 08:19:20 +02:00
Kumar Gala
a77ce8167c driver core: Add ability for arch code to setup pdev_archdata
On some architectures we need to setup pdev_archdata before we add the
device.  Waiting til a bus_notifier is too late since we might need the
pdev_archdata in the bus notifier.  One example is setting up of dma_mask
pointers such that it can be used in a bus_notifier.

We add weak noop version of arch_setup_pdev_archdata() and allow the arch
code to override with access the full definitions of struct device,
struct platform_device, and struct pdev_archdata.

Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-07-08 00:21:35 -05:00
Timur Tabi
6db7199407 drivers/virt: introduce Freescale hypervisor management driver
Add the drivers/virt directory, which houses drivers that support
virtualization environments, and add the Freescale hypervisor management
driver.

The Freescale hypervisor management driver provides several services to
drivers and applications related to the Freescale hypervisor:

1. An ioctl interface for querying and managing partitions

2. A file interface to reading incoming doorbells

3. An interrupt handler for shutting down the partition upon receiving the
   shutdown doorbell from a manager partition

4. A kernel interface for receiving callbacks when a managed partition
   shuts down.

Signed-off-by: Timur Tabi <timur@freescale.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
2011-07-08 00:21:27 -05:00
Keith Packard
bc67f799e7 Merge branch 'drm-intel-fixes' into drm-intel-next 2011-07-07 13:39:38 -07:00
Linus Torvalds
2a9d6df425 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
* 'for-linus' of git://git.kernel.dk/linux-block:
  drbd: we should write meta data updates with FLUSH FUA
  drbd: fix limit define, we support 1 PiByte now
  drbd: when receive times out on meta socket, also check last receive time on data socket
  drbd: account bitmap IO during resync as resync-(related-)-io
  drbd: don't cond_resched_lock with IRQs disabled
  drbd: add missing spinlock to bitmap receive
  drbd: Use the correct max_bio_size when creating resync requests
  cfq-iosched: make code consistent
  cfq-iosched: fix a rcu warning
2011-07-07 13:22:26 -07:00
David Howells
c902ce1bfb FS-Cache: Add a helper to bulk uncache pages on an inode
Add an FS-Cache helper to bulk uncache pages on an inode.  This will
only work for the circumstance where the pages in the cache correspond
1:1 with the pages attached to an inode's page cache.

This is required for CIFS and NFS: When disabling inode cookie, we were
returning the cookie and setting cifsi->fscache to NULL but failed to
invalidate any previously mapped pages.  This resulted in "Bad page
state" errors and manifested in other kind of errors when running
fsstress.  Fix it by uncaching mapped pages when we disable the inode
cookie.

This patch should fix the following oops and "Bad page state" errors
seen during fsstress testing.

  ------------[ cut here ]------------
  kernel BUG at fs/cachefiles/namei.c:201!
  invalid opcode: 0000 [#1] SMP
  Pid: 5, comm: kworker/u:0 Not tainted 2.6.38.7-30.fc15.x86_64 #1 Bochs Bochs
  RIP: 0010: cachefiles_walk_to_object+0x436/0x745 [cachefiles]
  RSP: 0018:ffff88002ce6dd00  EFLAGS: 00010282
  RAX: ffff88002ef165f0 RBX: ffff88001811f500 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: 0000000000000100 RDI: 0000000000000282
  RBP: ffff88002ce6dda0 R08: 0000000000000100 R09: ffffffff81b3a300
  R10: 0000ffff00066c0a R11: 0000000000000003 R12: ffff88002ae54840
  R13: ffff88002ae54840 R14: ffff880029c29c00 R15: ffff88001811f4b0
  FS:  00007f394dd32720(0000) GS:ffff88002ef00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 00007fffcb62ddf8 CR3: 000000001825f000 CR4: 00000000000006e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  Process kworker/u:0 (pid: 5, threadinfo ffff88002ce6c000, task ffff88002ce55cc0)
  Stack:
   0000000000000246 ffff88002ce55cc0 ffff88002ce6dd58 ffff88001815dc00
   ffff8800185246c0 ffff88001811f618 ffff880029c29d18 ffff88001811f380
   ffff88002ce6dd50 ffffffff814757e4 ffff88002ce6dda0 ffffffff8106ac56
  Call Trace:
   cachefiles_lookup_object+0x78/0xd4 [cachefiles]
   fscache_lookup_object+0x131/0x16d [fscache]
   fscache_object_work_func+0x1bc/0x669 [fscache]
   process_one_work+0x186/0x298
   worker_thread+0xda/0x15d
   kthread+0x84/0x8c
   kernel_thread_helper+0x4/0x10
  RIP  cachefiles_walk_to_object+0x436/0x745 [cachefiles]
  ---[ end trace 1d481c9af1804caa ]---

I tested the uncaching by the following means:

 (1) Create a big file on my NFS server (104857600 bytes).

 (2) Read the file into the cache with md5sum on the NFS client.  Look in
     /proc/fs/fscache/stats:

	Pages  : mrk=25601 unc=0

 (3) Open the file for read/write ("bash 5<>/warthog/bigfile").  Look in proc
     again:

	Pages  : mrk=25601 unc=25601

Reported-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-and-Tested-by: Suresh Jayaraman <sjayaraman@suse.de>
cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-07-07 13:21:56 -07:00
Linus Torvalds
27a3b735b7 Merge branches 'core-urgent-for-linus', 'perf-urgent-for-linus' and 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  debugobjects: Fix boot crash when kmemleak and debugobjects enabled

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  jump_label: Fix jump_label update for modules
  oprofile, x86: Fix race in nmi handler while starting counters

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  sched: Disable (revert) SCHED_LOAD_SCALE increase
  sched, cgroups: Fix MIN_SHARES on 64-bit boxen
2011-07-07 13:17:45 -07:00
Ben Greear
d18a90dd85 slub: Add method to verify memory is not freed
This is for tracking down suspect memory usage.

Acked-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Ben Greear <greearb@candelatech.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-07-07 22:17:08 +03:00
Christoph Lameter
90810645f7 slab allocators: Provide generic description of alignment defines
Provide description for alignment defines.

Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
2011-07-07 21:04:12 +03:00
Simon Guinot
659fb32d1b genirq: replace irq_gc_ack() with {set,clr}_bit variants (fwd)
This fixes a regression introduced by e59347a "arm: orion:
Use generic irq chip".

Depending on the device, interrupts acknowledgement is done by setting
or by clearing a dedicated register. Replace irq_gc_ack() with some
{set,clr}_bit variants allows to handle both cases.

Note that this patch affects the following SoCs: Davinci, Samsung and
Orion. Except for this last, the change is minor: irq_gc_ack() is just
renamed into irq_gc_ack_set_bit().

For the Orion SoCs, the edge GPIO interrupts support is currently
broken. irq_gc_ack() try to acknowledge a such interrupt by setting
the corresponding cause register bit. The Orion GPIO device expect the
opposite. To fix this issue, the irq_gc_ack_clr_bit() variant is used.

Tested on Network Space v2.

Reported-by: Joey Oravec <joravec@drewtech.com>
Signed-off-by: Simon Guinot <sguinot@lacie.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2011-07-07 16:02:26 +00:00
Steven Rostedt
43dd61c9a0 ftrace: Fix regression of :mod:module function enabling
The new code that allows different utilities to pick and choose
what functions they trace broke the :mod: hook that allows users
to trace only functions of a particular module.

The reason is that the :mod: hook bypasses the hash that is setup
to allow individual users to trace their own functions and uses
the global hash directly. But if the global hash has not been
set up, it will cause a bug:

echo '*:mod:radeon' > /sys/kernel/debug/set_ftrace_filter

produces:

 [drm:drm_mode_getfb] *ERROR* invalid framebuffer id
 [drm:radeon_crtc_page_flip] *ERROR* failed to reserve new rbo buffer before flip
 BUG: unable to handle kernel paging request at ffffffff8160ec90
 IP: [<ffffffff810d9136>] add_hash_entry+0x66/0xd0
 PGD 1a05067 PUD 1a09063 PMD 80000000016001e1
 Oops: 0003 [#1] SMP Jul  7 04:02:28 phyllis kernel: [55303.858604] CPU 1
 Modules linked in: cryptd aes_x86_64 aes_generic binfmt_misc rfcomm bnep ip6table_filter hid radeon r8169 ahci libahci mii ttm drm_kms_helper drm video i2c_algo_bit intel_agp intel_gtt

 Pid: 10344, comm: bash Tainted: G        WC  3.0.0-rc5 #1 Dell Inc. Inspiron N5010/0YXXJJ
 RIP: 0010:[<ffffffff810d9136>]  [<ffffffff810d9136>] add_hash_entry+0x66/0xd0
 RSP: 0018:ffff88003a96bda8  EFLAGS: 00010246
 RAX: ffff8801301735c0 RBX: ffffffff8160ec80 RCX: 0000000000306ee0
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880137c92940
 RBP: ffff88003a96bdb8 R08: ffff880137c95680 R09: 0000000000000000
 R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff81c9df78
 R13: ffff8801153d1000 R14: 0000000000000000 R15: 0000000000000000
 FS: 00007f329c18a700(0000) GS:ffff880137c80000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: ffffffff8160ec90 CR3: 000000003002b000 CR4: 00000000000006e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process bash (pid: 10344, threadinfo ffff88003a96a000, task ffff88012fcfc470)
 Stack:
  0000000000000fd0 00000000000000fc ffff88003a96be38 ffffffff810d92f5
  ffff88011c4c4e00 ffff880000000000 000000000b69f4d0 ffffffff8160ec80
  ffff8800300e6f06 0000000081130295 0000000000000282 ffff8800300e6f00
 Call Trace:
  [<ffffffff810d92f5>] match_records+0x155/0x1b0
  [<ffffffff810d940c>] ftrace_mod_callback+0xbc/0x100
  [<ffffffff810dafdf>] ftrace_regex_write+0x16f/0x210
  [<ffffffff810db09f>] ftrace_filter_write+0xf/0x20
  [<ffffffff81166e48>] vfs_write+0xc8/0x190
  [<ffffffff81167001>] sys_write+0x51/0x90
  [<ffffffff815c7e02>] system_call_fastpath+0x16/0x1b
 Code: 48 8b 33 31 d2 48 85 f6 75 33 49 89 d4 4c 03 63 08 49 8b 14 24 48 85 d2 48 89 10 74 04 48 89 42 08 49 89 04 24 4c 89 60 08 31 d2
 RIP [<ffffffff810d9136>] add_hash_entry+0x66/0xd0
  RSP <ffff88003a96bda8>
 CR2: ffffffff8160ec90
 ---[ end trace a5d031828efdd88e ]---

Reported-by: Brian Marete <marete@toshnix.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-07-07 11:30:08 -04:00
Michael Büsch
eb032b9837 Update my e-mail address
Signed-off-by: Michael Buesch <m@bues.ch>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-07-07 15:18:01 +02:00
Shirley Ma
a6686f2f38 skbuff: skb supports zero-copy buffers
This patch adds userspace buffers support in skb shared info. A new
struct skb_ubuf_info is needed to maintain the userspace buffers
argument and index, a callback is used to notify userspace to release
the buffers once lower device has done DMA (Last reference to that skb
has gone).

If there is any userspace apps to reference these userspace buffers,
then these userspaces buffers will be copied into kernel. This way we
can prevent userspace apps from holding these userspace buffers too long.

Use destructor_arg to point to the userspace buffer info; a new tx flags
SKBTX_DEV_ZEROCOPY is added for zero-copy buffer check.

Signed-off-by: Shirley Ma <xma@...ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-07 04:41:13 -07:00
Peter Ujfalusi
cfb7a33bea MFD: twl6040: Remove enum for PLL tracking
There is no need to have two different types for
tracking the selected PLL.
Use only the defines, when dealing with the PLLs.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
2011-07-07 14:23:46 +03:00
Peter Ujfalusi
1b7c4725e2 MFD: twl6040: Remove wrapper for threaded irq request
Remove the twl6040_request_irq/free_irq inline functions,
and use direct calls instead in the core driver to
register the threaded irq.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Reviewed-by: Felipe Balbi <balbi@ti.com>
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
2011-07-07 14:23:35 +03:00
Ingo Molnar
b395fb36d5 Merge branch 'iommu-3.1' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu into core/iommu 2011-07-07 12:58:28 +02:00
Johannes Stezenbach
719c0c5906 compat_ioctl: fix make headers_check regression
Fix headers_check error introduced by 390192b300:

include/linux/fd.h:6: included file 'linux/compat.h' is not exported

Signed-off-by: Johannes Stezenbach <js@sig21.net>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-07-07 08:18:18 +02:00
Daniel Drake
cfee95977b x86, olpc: Add XO-1 RTC driver
Add a driver to configure the XO-1 RTC via CS5536 MSRs, to be used as a
system wakeup source via olpc-xo1-pm.

Device detection is based on finding the relevant device tree node.

Signed-off-by: Daniel Drake <dsd@laptop.org>
Link: http://lkml.kernel.org/r/1309019658-1712-11-git-send-email-dsd@laptop.org
Acked-by: Andres Salomon <dilinger@queued.net>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: devicetree-discuss@lists.ozlabs.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-07-06 14:44:42 -07:00
Daniel Drake
2cf2baea10 x86, olpc-xo1-sci: Add lid switch functionality
Configure the XO-1's lid switch GPIO to trigger an SCI interrupt,
and correctly expose this input device which can be used as a wakeup
source.

Signed-off-by: Daniel Drake <dsd@laptop.org>
Link: http://lkml.kernel.org/r/1309019658-1712-9-git-send-email-dsd@laptop.org
Acked-by: Andres Salomon <dilinger@queued.net>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-07-06 14:44:39 -07:00
Daniel Drake
7bc74b3df7 x86, olpc-xo1-sci: Add GPE handler and ebook switch functionality
The EC in the OLPC XO-1 delivers GPE events to provide various
notifications. Add the basic code for GPE/EC event processing and
enable the ebook switch, which can be used as a wakeup source.

Signed-off-by: Daniel Drake <dsd@laptop.org>
Link: http://lkml.kernel.org/r/1309019658-1712-8-git-send-email-dsd@laptop.org
Acked-by: Andres Salomon <dilinger@queued.net>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-07-06 14:44:38 -07:00
Daniel Drake
7feda8e9f3 x86, olpc: Add XO-1 SCI driver and power button control
The System Control Interrupt is used in the OLPC XO-1 to control various
features of the laptop. Add the driver base and the power button
functionality.

This driver can't be built as a module, because functionality added in
future patches means that some drivers need to know at boot-time whether
SCI-based functionality is available.

Signed-off-by: Daniel Drake <dsd@laptop.org>
Link: http://lkml.kernel.org/r/1309019658-1712-6-git-send-email-dsd@laptop.org
Acked-by: Andres Salomon <dilinger@queued.net>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-07-06 14:44:34 -07:00
Daniel Drake
97c4cb71c1 x86, olpc: Add XO-1 suspend/resume support
Add code needed for basic suspend/resume of the XO-1 laptop.
Based on earlier work by Jordan Crouse, Andres Salomon, and others.

This patch incorporates all earlier feedback from Thomas Gleixner. To
clarify a certain point (now more obvious in the code itself):
On resume, OpenFirmware returns execution to Linux in protected mode
with a kernel-compatible GDT already set up. The changes and
simplifications suggested have all been included.

Signed-off-by: Daniel Drake <dsd@laptop.org>
Link: http://lkml.kernel.org/r/1309019658-1712-5-git-send-email-dsd@laptop.org
Acked-by: Andres Salomon <dilinger@queued.net>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-07-06 14:44:32 -07:00
Daniel Drake
7a0d4fcf6d x86, olpc: Move CS5536-related constants to cs5535.h
Move these definitions into the relevant header file.
This was requested in the review of the upcoming XO-1 suspend/resume code.

Signed-off-by: Daniel Drake <dsd@laptop.org>
Link: http://lkml.kernel.org/r/1309019658-1712-3-git-send-email-dsd@laptop.org
Acked-by: Andres Salomon <dilinger@queued.net>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-07-06 14:44:23 -07:00
Rob Herring
0e373639ad dt: add helper function to read u32 arrays
Rework of_property_read_u32 to read an array of values. Then
of_property_read_u32 becomes an inline with array size of 1.

Also make struct device_node ptr const.

Signed-off-by: Rob Herring <rob.herring@calxeda.com>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-07-06 14:58:09 -06:00
Johannes Berg
e5497d766a cfg80211/nl80211: support GTK rekey offload
In certain circumstances, like WoWLAN scenarios,
devices may implement (partial) GTK rekeying on
the device to avoid waking up the host for it.

In order to successfully go through GTK rekeying,
the KEK, KCK and the replay counter are required.

Add API to let the supplicant hand the parameters
to the driver which may store it for future GTK
rekey operations.

Note that, of course, if GTK rekeying is done by
the device, the EAP frame must not be passed up
to userspace, instead a rekey event needs to be
sent to let userspace update its replay counter.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-07-06 15:05:42 -04:00
David S. Miller
95ec3eb417 packet: Add 'cpu' fanout policy.
Unfortunately we have to use a real modulus here as
the multiply trick won't work as effectively with cpu
numbers as it does with rxhash values.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-06 01:56:38 -07:00
Shmulik Ravid
37cf4d1a9b dcbnl: Aggregated CEE GET operation
The following couple of patches add dcbnl an unsolicited notification of
the the DCB configuration for the CEE flavor of the DCBX protocol. This
is useful when the user-mode DCB client is not responsible for
conducting and resolving the DCBX negotiation (either because the DCBX
stack is embedded in the HW or the negotiation is handled by another
agent in the host), but still needs to get the negotiated parameters.
This functionality already exists for the IEEE flavor of the DCBX
protocol and these patches add it to the older CEE flavor.

The first patch extends the CEE attribute GET operation to include not
only the peer information, but also all the pertinent local
configuration (negotiated parameters). The second patch adds and export
a CEE specific notification routine.

Signed-off-by: Shmulik Ravid <shmulikr@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-05 23:42:17 -07:00
David S. Miller
e12fe68ce3 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6 2011-07-05 23:23:37 -07:00
David S. Miller
7736d33f42 packet: Add pre-defragmentation support for ipv4 fanouts.
The skb->rxhash cannot be properly computed if the
packet is a fragment.  To alleviate this, allow the
AF_PACKET client to ask for defragmentation to be
done at demux time.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-05 22:34:52 -07:00
David S. Miller
dc99f60069 packet: Add fanout support.
Fanouts allow packet capturing to be demuxed to a set of AF_PACKET
sockets.  Two fanout policies are implemented:

1) Hashing based upon skb->rxhash

2) Pure round-robin

An AF_PACKET socket must be fully bound before it tries to add itself
to a fanout.  All AF_PACKET sockets trying to join the same fanout
must all have the same bind settings.

Fanouts are identified (within a network namespace) by a 16-bit ID.
The first socket to try to add itself to a fanout with a particular
ID, creates that fanout.  When the last socket leaves the fanout
(which happens only when the socket is closed), that fanout is
destroyed.

Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-05 22:34:52 -07:00
David S. Miller
994635a137 Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 2011-07-05 18:22:39 -07:00
Lauro Ramos Venancio
23b7869c0f NFC: add the NFC socket raw protocol
This socket protocol is used to perform data exchange with NFC
targets.

Signed-off-by: Lauro Ramos Venancio <lauro.venancio@openbossa.org>
Signed-off-by: Aloisio Almeida Jr <aloisio.almeida@openbossa.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-07-05 15:26:58 -04:00
Aloisio Almeida Jr
c7fe3b52c1 NFC: add NFC socket family
Signed-off-by: Lauro Ramos Venancio <lauro.venancio@openbossa.org>
Signed-off-by: Aloisio Almeida Jr <aloisio.almeida@openbossa.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-07-05 15:26:58 -04:00
Lauro Ramos Venancio
4d12b8b129 NFC: add nfc generic netlink interface
The NFC generic netlink interface exports the NFC control operations
to the user space.

Signed-off-by: Lauro Ramos Venancio <lauro.venancio@openbossa.org>
Signed-off-by: Aloisio Almeida Jr <aloisio.almeida@openbossa.org>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-07-05 15:26:57 -04:00
Sergei Shtylyov
304e21bbea ssb: PCI revision ID register is 8-bit wide
The SSB code reads PCI revision ID register as 16-bit entity while the register
is actually 8-bit only (the next 8 bits are the programming interface register).
Fix the read and make the 'rev' field of 'struct ssb_boardinfo' 8-bit as well,
to match the register size.

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-07-05 15:26:55 -04:00
Ingo Molnar
931da6137e Merge branch 'tip/perf/core-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core 2011-07-05 11:55:43 +02:00
Peter Zijlstra
e4c2fb0d57 sched: Disable (revert) SCHED_LOAD_SCALE increase
Alex reported that commit c8b281161d ("sched: Increase
SCHED_LOAD_SCALE resolution") caused a power usage regression
under light load as it increases the number of load-balance
operations and keeps idle cpus from staying idle.

Time has run out to find the root cause for this release so
disable the feature for v3.0 until we can figure out what
causes the problem.

Reported-by: "Alex, Shi" <alex.shi@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Nikhil Rao <ncrao@google.com>
Cc: Ming Lei <tom.leiming@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-m4onxn0sxnyn5iz9o88eskc3@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-05 11:28:18 +02:00
Linus Torvalds
aababb9766 Merge branch 'fbdev-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-3.x
* 'fbdev-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/lethal/fbdev-3.x:
  vesafb: fix memory leak
  fbdev: amba: Link fb device to its parent
  fsl-diu-fb: remove check for pixel clock ranges
  udlfb: Correct sub-optimal resolution selection.
  hecubafb: add module_put on error path in hecubafb_probe()
  sm501fb: fix section mismatch warning
  gx1fb: Fix section mismatch warnings
  fbdev: sh_mobile_meram: Correct pointer check for YCbCr chroma plane
2011-07-04 15:53:53 -07:00
Axel Castaneda Gonzalez
1fbe99529d ASoC: twl6040: Configure ramp step based on platform
Enable ramp down/up step to be configured based on
platform.

Signed-off-by: Axel Castaneda Gonzalez <x0055901@ti.com>
Signed-off-by: Misael Lopez Cruz <misael.lopez@ti.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2011-07-04 19:36:29 +03:00
Misael Lopez Cruz
cc697d3839 input: Add initial support for TWL6040 vibrator
Add twl6040_vibra as a child of MFD device twl6040_codec. This
implementation covers the PCM-to-PWM mode of TWL6040 vibrator
module.

Signed-off-by: Misael Lopez Cruz <misael.lopez@ti.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
CC: Tejun Heo <tj@kernel.org>
Acked-by: Dmitry Torokhov <dtor@mail.ru>
2011-07-04 19:36:18 +03:00
Misael Lopez Cruz
f19b2823f8 mfd: twl6040: Add initial support
TWL6040 IC provides analog high-end audio codec functions for
handset applications. It contains several audio analog inputs
and outputs as well as vibrator support. It's connected to the
host processor via PDM interface for audio data communication.
The audio modules are controlled by internal registers that
can be accessed by I2C and PDM interface.

TWL6040 MFD will be registered as a child of TWL-CORE, and will
have two children of its own: twl6040-codec and twl6040-vibra.

This driver is based on TWL4030 and WM8350 MFD drivers.

Signed-off-by: Misael Lopez Cruz <misael.lopez@ti.com>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
2011-07-04 19:34:37 +03:00
Peter Ujfalusi
4ae6df5e10 MFD: twl4030-audio: Rename platform data
Allign the platform data names for twl4030 audio submodule:
twl4030_audio_data: for the core MFD driver
twl4030_codec_data: for ASoC codec driver
twl4030_vibra_data: for the input/ForceFeedback driver

To avoid breakage, change all depending drivers, files
to use the new types.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
2011-07-04 18:44:02 +03:00
Peter Ujfalusi
57fe7251f5 MFD: twl4030-codec -> twl4030-audio: Rename the driver
Rename the driver, and header file from twl4030-codec to
twl4030-audio.
To avoid breakage change depending drivers at the same time.

Signed-off-by: Peter Ujfalusi <peter.ujfalusi@ti.com>
CC: Misael Lopez Cruz <misael.lopez@ti.com>
Acked-by: Samuel Ortiz <sameo@linux.intel.com>
2011-07-04 18:43:56 +03:00
Shawn Guo
f09bc831b7 dt: add 'const' for of_property_read_string parameter **out_string
The existing dt codes usually call of_get_property to get a string
property and save it as a 'const char *'.  The patch adds'const' for
of_property_read_string parameter **out_string to make the converting
of existing code a little easier.

Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-07-03 23:47:22 -06:00
Joe Perches
234b921dbc netpoll: Remove unused EXPORT_SYMBOLs of netpoll_poll and netpoll_poll_dev
Unused symbols waste space.

Commit 0e34e93177
"(netpoll: add generic support for bridge and bonding devices)"
added the symbol more than a year ago with the promise of "future use".

Because it is so far unused, remove it for now.
It can be easily readded if or when it actually needs to be used.

cc: WANG Cong <amwang@redhat.com>
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-07-03 20:02:07 -07:00
Rafael J. Wysocki
3d5c30367c PM: Rename clock management functions
The common PM clock management functions may be used for system
suspend/resume as well as for runtime PM, so rename them
accordingly.  Modify kerneldoc comments describing these functions
and kernel messages printed by them, so that they refer to power
management in general rather that to runtime PM.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Kevin Hilman <khilman@ti.com>
2011-07-02 14:29:57 +02:00
Rafael J. Wysocki
b7b95920aa PM: Allow the clocks management code to be used during system suspend
The common clocks management code in drivers/base/power/clock_ops.c
is going to be used during system-wide power transitions as well as
for runtime PM, so it shouldn't depend on CONFIG_PM_RUNTIME.
However, the suspend/resume functions provided by it for
CONFIG_PM_RUNTIME unset, to be used during system-wide power
transitions, should not behave in the same way as their counterparts
defined for CONFIG_PM_RUNTIME set, because in that case the clocks
are managed differently at run time.

The names of the functions still contain the word "runtime" after
this change, but that is going to be modified by a separate patch
later.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Kevin Hilman <khilman@ti.com>
2011-07-02 14:29:56 +02:00
Rafael J. Wysocki
d4f2d87a8b PM / Domains: Wakeup devices support for system sleep transitions
There is the problem how to handle devices set up to wake up the
system from sleep states during system-wide power transitions.
In some cases, those devices can be turned off entirely, because the
wakeup signals will be generated on their behalf anyway.  In some
other cases, they will generate wakeup signals if their clocks are
stopped, but only if power is not removed from them.  Finally, in
some cases, they can only generate wakeup signals if power is not
removed from them and their clocks are enabled.

To allow platform-specific code to decide whether or not to put
wakeup devices (and their PM domains) into low-power state during
system-wide transitions, such as system suspend, introduce a new
generic PM domain callback, .active_wakeup(), that will be used
during the "noirq" phase of system suspend and hibernation (after
image creation) to decide what to do with wakeup devices.
Specifically, if this callback is present and returns "true", the
generic PM domain code will not execute .stop_device() for the
given wakeup device and its PM domain won't be powered off.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Kevin Hilman <khilman@ti.com>
2011-07-02 14:29:56 +02:00
Rafael J. Wysocki
596ba34bcd PM / Domains: System-wide transitions support for generic domains (v5)
Make generic PM domains support system-wide power transitions
(system suspend and hibernation).  Add suspend, resume, freeze, thaw,
poweroff and restore callbacks to be associated with struct
generic_pm_domain objects and make pm_genpd_init() use them as
appropriate.

The new callbacks do nothing for devices belonging to power domains
that were powered down at run time (before the transition).  For the
other devices the action carried out depends on the type of the
transition.  During system suspend the power domain .suspend()
callback executes pm_generic_suspend() for the device, while the
PM domain .suspend_noirq() callback runs pm_generic_suspend_noirq()
for it, stops it and eventually removes power from the PM domain it
belongs to (after all devices in the domain have been stopped and its
subdomains have been powered off).

During system resume the PM domain .resume_noirq() callback
restores power to the PM domain (when executed for it first time),
starts the device and executes pm_generic_resume_noirq() for it,
while the .resume() callback executes pm_generic_resume() for the
device.  Finally, the .complete() callback executes pm_runtime_idle()
for the device which should put it back into the suspended state if
its runtime PM usage count is equal to zero at that time.

The actions carried out during hibernation and resume from it are
analogous to the ones described above.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Reviewed-by: Kevin Hilman <khilman@ti.com>
2011-07-02 14:29:56 +02:00
Rafael J. Wysocki
e529192883 PM: Introduce generic "noirq" callback routines for subsystems (v2)
Introduce generic "noirq" power management callback routines for
subsystems in addition to the "regular" generic PM callback routines.

The new routines will be used, among other things, for implementing
system-wide PM transitions support for generic PM domains.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-07-02 14:29:55 +02:00
Rafael J. Wysocki
f721889ff6 PM / Domains: Support for generic I/O PM domains (v8)
Introduce common headers, helper functions and callbacks allowing
platforms to use simple generic power domains for runtime power
management.

Introduce struct generic_pm_domain to be used for representing
power domains that each contain a number of devices and may be
parent domains or subdomains with respect to other power domains.
Among other things, this structure includes callbacks to be
provided by platforms for performing specific tasks related to
power management (i.e. ->stop_device() may disable a device's
clocks, while ->start_device() may enable them, ->power_off() is
supposed to remove power from the entire power domain
and ->power_on() is supposed to restore it).

Introduce functions that can be used as power domain runtime PM
callbacks, pm_genpd_runtime_suspend() and pm_genpd_runtime_resume(),
as well as helper functions for the initialization of a power
domain represented by a struct generic_power_domain object,
adding a device to or removing a device from it and adding or
removing subdomains.

Introduce configuration option CONFIG_PM_GENERIC_DOMAINS to be
selected by the platforms that want to use the new code.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Greg Kroah-Hartman <gregkh@suse.de>
Reviewed-by: Kevin Hilman <khilman@ti.com>
2011-07-02 14:29:55 +02:00
Rafael J. Wysocki
dc6e4e56e6 PM: subsys_data in struct dev_pm_info need not depend on RM_RUNTIME
The subsys_data field of struct dev_pm_info, introduced by commit
1d2b71f61b (PM / Runtime: Add subsystem
data field to struct dev_pm_info), is going to be used even if
CONFIG_PM_RUNTIME is not set, so move it from under the #ifdef.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
2011-07-02 14:29:54 +02:00
Rafael J. Wysocki
564b905ab1 PM / Domains: Rename struct dev_power_domain to struct dev_pm_domain
The naming convention used by commit 7538e3db6e015e890825fbd9f86599b
(PM: Add support for device power domains), which introduced the
struct dev_power_domain type for representing device power domains,
evidently confuses some developers who tend to think that objects
of this type must correspond to "power domains" as defined by
hardware, which is not the case.  Namely, at the kernel level, a
struct dev_power_domain object can represent arbitrary set of devices
that are mutually dependent power management-wise and need not belong
to one hardware power domain.  To avoid that confusion, rename struct
dev_power_domain to struct dev_pm_domain and rename the related
pointers in struct device and struct pm_clk_notifier_block from
pwr_domain to pm_domain.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Kevin Hilman <khilman@ti.com>
2011-07-02 14:29:54 +02:00
J Freyensee
8168e9c2de PTI feature to allow user to name and mark masterchannel request.
This feature addition provides a new parameter in
pti_request_masterchannel() to allow the user
to provide their own name to mark the request when
the trace is viewed in a PTI SW trace viewer
(like MPTA).  If a name is not provided and
NULL is provided, the 'current' process name is used.
API function header documentation documents this.

Signed-off-by: Jeremy Rocher <rocher.jeremy@gmail.com>
Signed-off-by: J Freyensee <james_p_freyensee@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-07-01 15:39:38 -07:00
Russ Gorby
bcd5abe28f tty: n_gsm: Add raw-ip support
This patch adds the ability to open a network data connection over a mux
virtual tty channel. This is for modems that support data connections
with raw IP frames instead of PPP. On high speed data connections this
eliminates a significant amount of PPP overhead. To use this interface,
the application must first tell the modem to open a network connection on
a virtual tty. Once that has been accomplished, the app will issue an
IOCTL on that virtual tty to create the network interface. The IOCTL will
return the index of the interface created.

The two IOCTL commands are:

    ioctl( fd, GSMIOC_ENABLE_NET );

    ioctl( fd, GSMIOC_DISABLE_NET );

Signed-off-by: Russ Gorby <russ.gorby@intel.com>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-07-01 15:34:45 -07:00
Sebastian Andrzej Siewior
352c2dc8b0 usb: gadget: udc-core: add "new-style" registration interface
udc_start() should only trigger the internal state machine and make
minimal house keeping. Before that call udc-core calls the bind()
callback and after the callback the pullup().

udc_stop() is simillar, udc-core calls pullup(), unbind() and finally
udc_stop().

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-07-01 14:31:13 -07:00
Sergei Shtylyov
e9c23a255a usb: gadget: add missing #include's
When #include'd alone, <linux/usb/gadget.h>
causes a lot of compilation errors and warnings
-- all because it relies on the including code to
bring in the necessary #include's instead of
doing this itself.

Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-07-01 14:31:05 -07:00
Tatyana Brokhman
bdb64d7272 usb: gadget: add SuperSpeed support to the Gadget Framework
SuperSpeed USB has defined a new descriptor, called
the Binary Device Object Store (BOS) Descriptor. It
has also changed a bit the definition of SET_FEATURE
and GET_STATUS requests to add USB3-specific details.

This patch implements both changes to the Composite
Gadget Framework.

[ balbi@ti.com : slight changes to commit log
		 fixed a compile error on ARM ]

Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-07-01 14:27:05 -07:00
Tatyana Brokhman
35a0e0bf6f usb: gadget: add max_speed to usb_composite_driver
This field is used by the Gadget drivers to specify
the maximum speed they support, meaning: the maximum
speed they can provide descriptors for.

The driver speed will be set in consideration of this
value.

[ balbi@ti.com : dropped the ifdeffery ]

Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-07-01 14:27:05 -07:00
Johannes Stezenbach
390192b300 compat_ioctl: fix warning caused by qemu
On Linux x86_64 host with 32bit userspace, running
qemu or even just "qemu-img create -f qcow2 some.img 1G"
causes a kernel warning:

ioctl32(qemu-img:5296): Unknown cmd fd(3) cmd(00005326){t:'S';sz:0} arg(7fffffff) on some.img
ioctl32(qemu-img:5296): Unknown cmd fd(3) cmd(801c0204){t:02;sz:28} arg(fff77350) on some.img

ioctl 00005326 is CDROM_DRIVE_STATUS,
ioctl 801c0204 is FDGETPRM.

The warning appears because the Linux compat-ioctl handler for these
ioctls only applies to block devices, while qemu also uses the ioctls on
plain files.

Signed-off-by: Johannes Stezenbach <js@sig21.net>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-07-01 22:32:26 +02:00
Tejun Heo
85ef06d1d2 block: flush MEDIA_CHANGE from drivers on close(2)
Currently, only open(2) is defined as the 'clearing' point.  It has
two roles - first, it's an acknowledgement from userland indicating
that the event has been received and kernel can clear pending states
and proceed to generate more events.  Secondly, it's passed on to
device drivers as a hint indicating that a synchronization point has
been reached and it might want to take a deeper look at the device.

The latter currently is only used by sr which uses two different
mechanisms - GET_EVENT_MEDIA_STATUS_NOTIFICATION and TEST_UNIT_READY
to discover events, where the former is lighter weight and safe to be
used repeatedly but may not provide full coverage.  Among other
things, GET_EVENT can't detect media removal while TUR can.

This patch makes close(2) - blkdev_put() - indicate clearing hint for
MEDIA_CHANGE to drivers.  disk_check_events() is renamed to
disk_flush_events() and updated to take @mask for events to flush
which is or'd to ev->clearing and will be passed to the driver on the
next ->check_events() invocation.

This change makes sr generate MEDIA_CHANGE when media is ejected from
userland - e.g. with eject(1).

Note: Given the current usage, it seems @clearing hint is needlessly
complex.  disk_clear_events() can simply clear all events and the hint
can be boolean @flush.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-07-01 16:17:47 +02:00
Jens Axboe
04bf7869ca Merge branch 'for-linus' into for-3.1/core
Conflicts:
	block/blk-throttle.c
	block/cfq-iosched.c

Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2011-07-01 16:17:13 +02:00
Ingo Molnar
1ecc818c51 Merge branch 'sched/core-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing into sched/core 2011-07-01 13:20:51 +02:00
Avi Kivity
26ca5c11fb perf: export perf_event_refresh() to modules
KVM needs one-shot samples, since a PMC programmed to -X will fire after X
events and then again after 2^40 events (i.e. variable period).

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1309362157-6596-4-git-send-email-avi@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 11:06:40 +02:00
Avi Kivity
4dc0da8696 perf: Add context field to perf_event
The perf_event overflow handler does not receive any caller-derived
argument, so many callers need to resort to looking up the perf_event
in their local data structure.  This is ugly and doesn't scale if a
single callback services many perf_events.

Fix by adding a context parameter to perf_event_create_kernel_counter()
(and derived hardware breakpoints APIs) and storing it in the perf_event.
The field can be accessed from the callback as event->overflow_handler_context.
All callers are updated.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1309362157-6596-2-git-send-email-avi@redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 11:06:38 +02:00
Peter Zijlstra
89d6c0b5bd perf, arch: Add generic NODE cache events
Add a NODE level to the generic cache events which is used to measure
local vs remote memory accesses. Like all other cache events, an
ACCESS is HIT+MISS, if there is no way to distinguish between reads
and writes do reads only etc..

The below needs filling out for !x86 (which I filled out with
unsupported events).

I'm fairly sure ARM can leave it like that since it doesn't strike me as
an architecture that even has NUMA support. SH might have something since
it does appear to have some NUMA bits.

Sparc64, PowerPC and MIPS certainly want a good look there since they
clearly are NUMA capable.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: David Miller <davem@davemloft.net>
Cc: Anton Blanchard <anton@samba.org>
Cc: David Daney <ddaney@caviumnetworks.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1303508226.4865.8.camel@laptop
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 11:06:38 +02:00
Stephane Eranian
efc9f05df2 perf_events: Update Intel extra regs shared constraints management
This patch improves the code managing the extra shared registers
used for offcore_response events on Intel Nehalem/Westmere. The
idea is to use static allocation instead of dynamic allocation.
This simplifies greatly the get and put constraint routines for
those events.

The patch also renames per_core to shared_regs because the same
data structure gets used whether or not HT is on. When HT is
off, those events still need to coordination because they use
a extra MSR that has to be shared within an event group.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20110606145703.GA7258@quad
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 11:06:36 +02:00
Peter Zijlstra
a7ac67ea02 perf: Remove the perf_output_begin(.sample) argument
Since only samples call perf_output_sample() its much saner (and more
correct) to put the sample logic in there than in the
perf_output_begin()/perf_output_end() pair.

Saves a useless argument, reduces conditionals and shrinks
struct perf_output_handle, win!

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/n/tip-2crpvsx3cqu67q3zqjbnlpsc@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 11:06:35 +02:00
Peter Zijlstra
a8b0ca17b8 perf: Remove the nmi parameter from the swevent and overflow interface
The nmi parameter indicated if we could do wakeups from the current
context, if not, we would set some state and self-IPI and let the
resulting interrupt do the wakeup.

For the various event classes:

  - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from
    the PMI-tail (ARM etc.)
  - tracepoint: nmi=0; since tracepoint could be from NMI context.
  - software: nmi=[0,1]; some, like the schedule thing cannot
    perform wakeups, and hence need 0.

As one can see, there is very little nmi=1 usage, and the down-side of
not using it is that on some platforms some software events can have a
jiffy delay in wakeup (when arch_irq_work_raise isn't implemented).

The up-side however is that we can remove the nmi parameter and save a
bunch of conditionals in fast paths.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Michael Cree <mcree@orcon.net.nz>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Don Zickus <dzickus@redhat.com>
Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 11:06:35 +02:00
Richard Kennedy
28009ce4a8 perf: Remove 64-bit alignment padding from perf_event_context
Reorder perf_event_context to remove 8 bytes of 64 bit alignment padding
shrinking its size to 192 bytes, allowing it to fit into a smaller slab
and use one fewer cache lines.

Signed-off-by: Richard Kennedy <richard@rsk.demon.co.uk>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1307460819.1950.5.camel@castor.rsk
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 11:06:32 +02:00
Thomas Gleixner
01898e3e29 i8253: Cleanup outb/inb magic
Remove the hysterical outb/inb_pit defines and use outb_p/inb_p in the
code.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20110609130622.348437125@linutronix.de
2011-07-01 10:37:15 +02:00
Thomas Gleixner
e6220bdc94 i8253: Create common clockevent implementation
arm, mips and x86 implement i8253 based clockevents. All the same code
copied. Create a common implementation in drivers/clocksource/i8253.c.

About time to rename drivers/clocksource/ to something else.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/20110609130621.921710458@linutronix.de
2011-07-01 10:37:14 +02:00
Ingo Molnar
10e6962765 Merge commit 'v3.0-rc5' into perf/core
Merge reason: Pick up the latest fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-07-01 10:28:46 +02:00
Amit Kumar Salecha
0209bcd4d9 net: add external loopback test in ethtool self test
External loopback test can be performed by application without any driver
support on normal Ethernet cards.
But on CNA devices, where multiple functions share same physical port.
Here internal loopback test and external loopback test can be initiated by
multiple functions at same time. To co exist all functions, firmware need
to regulate what test can be run by which function. So before performing external
loopback test, command need to send to firmware, which will quiescent other functions.

User may not want to run external loopback test always. As special cable need to be
connected for this test.
So adding explicit flag in ethtool self test, which will specify interface
to perform external loopback test.
 ETH_TEST_FL_EXTERNAL_LB: Application set to request external loopback test
 ETH_TEST_FL_EXTERNAL_LB_DONE: Driver ack if test performed

Signed-off-by: Amit Kumar Salecha <amit.salecha@qlogic.com>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-30 22:32:49 -07:00
Thomas Abraham
a3b853633d dt: add helper functions to read u32 and string property values
Add helper functions to retrieve unsigned integer and string property
values from properties of a device node. These helper functions can be
used to lookup a property in a device node, perform error checking and
read the property value.

[grant.likely@secretlab.ca: Proposal and initial implementation]
Signed-off-by: Thomas Abraham <thomas.abraham@linaro.org>
[grant.likely: some word smithing and be more defensive validating the string]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-06-30 13:02:10 -06:00
John W. Linville
df2cbe4075 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem 2011-06-30 13:34:06 -04:00
Mr Dash Four
131ad62d8f netfilter: add SELinux context support to AUDIT target
In this revision the conversion of secid to SELinux context and adding it
to the audit log is moved from xt_AUDIT.c to audit.c with the aid of a
separate helper function - audit_log_secctx - which does both the conversion
and logging of SELinux context, thus also preventing internal secid number
being leaked to userspace. If conversion is not successful an error is raised.

With the introduction of this helper function the work done in xt_AUDIT.c is
much more simplified. It also opens the possibility of this helper function
being used by other modules (including auditd itself), if desired. With this
addition, typical (raw auditd) output after applying the patch would be:

type=NETFILTER_PKT msg=audit(1305852240.082:31012): action=0 hook=1 len=52 inif=? outif=eth0 saddr=10.1.1.7 daddr=10.1.2.1 ipid=16312 proto=6 sport=56150 dport=22 obj=system_u:object_r:ssh_client_packet_t:s0
type=NETFILTER_PKT msg=audit(1306772064.079:56): action=0 hook=3 len=48 inif=eth0 outif=? smac=00:05:5d:7c:27:0b dmac=00:02:b3:0a:7f:81 macproto=0x0800 saddr=10.1.2.1 daddr=10.1.1.7 ipid=462 proto=6 sport=22 dport=3561 obj=system_u:object_r:ssh_server_packet_t:s0

Acked-by: Eric Paris <eparis@redhat.com>
Signed-off-by: Mr Dash Four <mr.dash.four@googlemail.com>
Signed-off-by: Patrick McHardy <kaber@trash.net>
2011-06-30 13:31:57 +02:00
Jens Axboe
7b28afe01a Merge branch 'for-3.0-important' of git://git.drbd.org/linux-2.6-drbd into for-linus 2011-06-30 10:10:50 +02:00
Lars Ellenberg
15b493d11f drbd: fix limit define, we support 1 PiByte now
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
2011-06-30 09:23:45 +02:00
Mike Christie
f457a46f17 [SCSI] iscsi_ibft, be2iscsi, iscsi_boot: fix boot kobj data lifetime management
be2iscsi passes the boot functions its phba object which is
allocated in the shost, but iscsi_ibft passes in a object
allocated for each item to display. The problem is that
iscsi_boot_sysfs was managing the lifetime of the object
passed in and doing a kfree on release. This causes a double
free for be2iscsi which frees the shost in its pci_remove.

This patch fixes the problem by adding a release callback
which the drivers can call kfree or a put() type of function
(needed for be2iscsi which will do a get/put on the shost).

Signed-off-by: Mike Christie <michaelc@cs.wisc.edu>
Signed-off-by: James Bottomley <JBottomley@Parallels.com>
2011-06-29 16:43:06 -05:00
Keith Packard
8eb2c0ee67 Merge branch 'drm-intel-fixes' into drm-intel-next 2011-06-29 10:34:54 -07:00
John Bonesio
a6b0919140 of/gpio: Add new method for getting gpios under different property names
This patch adds a new routine, of_get_named_gpio_flags(), which takes the
property name as a parameter rather than assuming "gpios".

of_get_gpio_flags() is modified to call of_get_named_gpio_flags() with "gpios"
as the property parameter.

Signed-off-by: John Bonesio <bones@secretlab.ca>
[grant.likely: Tidied up whitespace and tweaked kerneldoc comments.]
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-06-28 15:39:50 -06:00
Jesse Barnes
3d73710880 cpufreq: expose a cpufreq_quick_get_max routine
This allows drivers and other code to get the max reported CPU frequency.
Initial use is for scaling ring frequency with GPU frequency in the i915
driver.

Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Signed-off-by: Keith Packard <keithp@keithp.com>
2011-06-28 13:54:26 -07:00
Tatyana Brokhman
a59d6b91cb usb: gadget: add streams support to the gadget framework
This patch defines necessary fields to support
streaming for USB3.0.

It implements a new function, called
usb_ep_autoconfig_ss(), to be used instead of the
existing usb_ep_autoconfig() when working in
SuperSpeed mode and there is a need to search for
an endpoint according to the number of required
streams.

[ balbi@ti.com : slight changes to commit log ]

Signed-off-by: Maya Erez <merez@codeaurora.org>
Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-06-28 11:20:15 -07:00
Linus Torvalds
c89b857ce6 Merge branch 'driver-core-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6
* 'driver-core-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6:
  Connector: Correctly set the error code in case of success when dispatching receive callbacks
  Connector: Set the CN_NETLINK_USERS correctly
  pti: PTI semantics fix in pti_tty_cleanup.
  pti: ENXIO error case memory leak PTI fix.
  pti: double-free security PTI fix
  drivers:misc: ti-st: fix skipping of change remote baud
  drivers/base/platform.c: don't mark platform_device_register_resndata() as __init_or_module
  st_kim: Handle case of no device found for ID 0
  firmware: fix GOOGLE_SMI kconfig dependency warning
2011-06-28 11:15:36 -07:00
Linus Torvalds
04b905942b Merge branch 'tty-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6
* 'tty-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty-2.6:
  serial: bcm63xx_uart: fix irq storm after rx fifo overrun.
  amba pl011: platform data for reg lockup and glitch v2
  amba pl011: workaround for uart registers lockup
  tty: n_gsm: improper skb_pull() use was leaking framed data
  tty: n_gsm: Fixed logic to decode break signal from modem status
  TTY: ntty, add one more sanity check
  TTY: ldisc, do not close until there are readers
  8250: Fix capabilities when changing the port type
  8250_pci: Fix missing const from merges
  ARM: SAMSUNG: serial: Fix on handling of one clock source for UART
  serial: ioremap warning fix for jsm driver.
  8250_pci: add -ENODEV code for Intel EG20T PCH
2011-06-28 11:14:55 -07:00
Tatyana Brokhman
ea2a1df7b2 usb: gadget: use config_ep_by_speed() instead of ep_choose()
Remove obsolete functions:
1. ep_choose()
2. usb_find_endpoint()

Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-06-28 11:14:37 -07:00
Tatyana Brokhman
48767a4e82 usb: gadget: configure endpoint according to gadget speed
Add config_ep_by_speed() to configure the endpoint
according to the gadget speed.

Using this function will spare the FDs from handling
the endpoint chosen descriptor.

Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-06-28 11:14:36 -07:00
Tatyana Brokhman
72c973dd2b usb: gadget: add usb_endpoint_descriptor to struct usb_ep
Change usb_ep_enable() prototype to use endpoint
descriptor from usb_ep.

This optimization spares the FDs from saving the
endpoint chosen descriptor. This optimization is
not full though. To fully exploit this change, one
needs to update all the UDCs as well since in the
current implementation each of them saves the
endpoint descriptor in it's internal (and extended)
endpoint structure.

Signed-off-by: Tatyana Brokhman <tlinder@codeaurora.org>
Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-06-28 11:14:36 -07:00
Felipe Balbi
2ccea03a8f usb: gadget: introduce UDC Class
this class will be used to abstract away several of the duplicated
operations scattered among the USB gadget controller drivers.

Later, we can add an atomic notifier to tell interested drivers about
what's happening with the controller. Notifications such as suspend,
resume, enumerated, etc. will be useful, at a minimum, for implementing
usb charger detection.

As part of the converting process usb_gadget_probe_driver() is no longer
part of each udc but pushed into the ->stap() callback. The same for his
couterpart.

The core is currently set explicit to 'n'. It will be changed to 'y' once
all users are converted since it provides functions which clash with
other drivers.

Signed-off-by: Felipe Balbi <balbi@ti.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Michal Nazarewicz <mina86@mina86.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2011-06-28 11:12:51 -07:00
Vitaliy Ivanov
e1f91f82b8 treewide: fix kernel-doc warnings
Fix 'make htmldocs' warnings:

Warning(/include/linux/hrtimer.h:153): No description found for
parameter 'clockid'
Warning(/include/linux/device.h:604): Excess struct/union/enum/typedef
member 'of_match' description in 'device'
Warning(/include/net/sock.h:349): Excess struct/union/enum/typedef
member 'sk_rmem_alloc' description in 'sock'

Signed-off-by: Vitaliy Ivanov <vitalivanov@gmail.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2011-06-28 10:48:34 +02:00
Jan Kara
08142579b6 mm: fix assertion mapping->nrpages == 0 in end_writeback()
Under heavy memory and filesystem load, users observe the assertion
mapping->nrpages == 0 in end_writeback() trigger.  This can be caused by
page reclaim reclaiming the last page from a mapping in the following
race:

	CPU0				CPU1
  ...
  shrink_page_list()
    __remove_mapping()
      __delete_from_page_cache()
        radix_tree_delete()
					evict_inode()
					  truncate_inode_pages()
					    truncate_inode_pages_range()
					      pagevec_lookup() - finds nothing
					  end_writeback()
					    mapping->nrpages != 0 -> BUG
        page->mapping = NULL
        mapping->nrpages--

Fix the problem by doing a reliable check of mapping->nrpages under
mapping->tree_lock in end_writeback().

Analyzed by Jay <jinshan.xiong@whamcloud.com>, lost in LKML, and dug out
by Miklos Szeredi <mszeredi@suse.de>.

Cc: Jay <jinshan.xiong@whamcloud.com>
Cc: Miklos Szeredi <mszeredi@suse.de>
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-27 18:00:13 -07:00
Chris Metcalf
507c5f1224 include/linux/compat.h: declare compat_sys_sendmmsg()
This is required for tilegx to be able to use the compat unistd.h header
where compat_sys_sendmmsg() is now mentioned.

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-27 18:00:13 -07:00
Hugh Dickins
d9d90e5eb7 tmpfs: add shmem_read_mapping_page_gfp
Although it is used (by i915) on nothing but tmpfs, read_cache_page_gfp()
is unsuited to tmpfs, because it inserts a page into pagecache before
calling the filesystem's ->readpage: tmpfs may have pages in swapcache
which only it knows how to locate and switch to filecache.

At present tmpfs provides a ->readpage method, and copes with this by
copying pages; but soon we can simplify it by removing its ->readpage.
Provide shmem_read_mapping_page_gfp() now, ready for that transition,

Export shmem_read_mapping_page_gfp() and add it to list in shmem_fs.h,
with shmem_read_mapping_page() inline for the common mapping_gfp case.

(shmem_read_mapping_page_gfp or shmem_read_cache_page_gfp? Generally the
read_mapping_page functions use the mapping's ->readpage, and the
read_cache_page functions use the supplied filler, so I think
read_cache_page_gfp was slightly misnamed.)

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-27 18:00:12 -07:00
Hugh Dickins
94c1e62df4 tmpfs: take control of its truncate_range
2.6.35's new truncate convention gave tmpfs the opportunity to control
its file truncation, no longer enforced from outside by vmtruncate().
We shall want to build upon that, to handle pagecache and swap together.

Slightly redefine the ->truncate_range interface: let it now be called
between the unmap_mapping_range()s, with the filesystem responsible for
doing the truncate_inode_pages_range() from it - just as the filesystem
is nowadays responsible for doing that from its ->setattr.

Let's rename shmem_notify_change() to shmem_setattr().  Instead of
calling the generic truncate_setsize(), bring that code in so we can
call shmem_truncate_range() - which will later be updated to perform its
own variant of truncate_inode_pages_range().

Remove the punch_hole unmap_mapping_range() from shmem_truncate_range():
now that the COW's unmap_mapping_range() comes after ->truncate_range,
there is no need to call it a third time.

Export shmem_truncate_range() and add it to the list in shmem_fs.h, so
that i915_gem_object_truncate() can call it explicitly in future; get
this patch in first, then update drm/i915 once this is available (until
then, i915 will just be doing the truncate_inode_pages() twice).

Though introduced five years ago, no other filesystem is implementing
->truncate_range, and its only other user is madvise(,,MADV_REMOVE): we
expect to convert it to fallocate(,FALLOC_FL_PUNCH_HOLE,,) shortly,
whereupon ->truncate_range can be removed from inode_operations -
shmem_truncate_range() will help i915 across that transition too.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-27 18:00:12 -07:00
Hugh Dickins
072441e21d mm: move shmem prototypes to shmem_fs.h
Before adding any more global entry points into shmem.c, gather such
prototypes into shmem_fs.h.  Remove mm's own declarations from swap.h,
but for now leave the ones in mm.h: because shmem_file_setup() and
shmem_zero_setup() are called from various places, and we should not
force other subsystems to update immediately.

Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-27 18:00:12 -07:00
Vitaliy Ivanov
4d258b25d9 Fix some kernel-doc warnings
Fix 'make htmldocs' warnings:

  Warning(/include/linux/hrtimer.h:153): No description found for parameter 'clockid'
  Warning(/include/linux/device.h:604): Excess struct/union/enum/typedef member 'of_match' description in 'device'
  Warning(/include/net/sock.h:349): Excess struct/union/enum/typedef member 'sk_rmem_alloc' description in 'sock'

Signed-off-by: Vitaliy Ivanov <vitalivanov@gmail.com>
Acked-by: Grant Likely <grant.likely@secretlab.ca>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-27 16:06:19 -07:00
Suresh Siddha
192d885742 x86, mtrr: use stop_machine APIs for doing MTRR rendezvous
MTRR rendezvous sequence is not implemened using stop_machine() before, as this
gets called both from the process context aswell as the cpu online paths
(where the cpu has not come online and the interrupts are disabled etc).

Now that we have a new stop_machine_from_inactive_cpu() API, use it for
rendezvous during mtrr init of a logical processor that is coming online.

For the rest (runtime MTRR modification, system boot, resume paths), use
stop_machine() to implement the rendezvous sequence. This will consolidate and
cleanup the code.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/20110623182057.076997177@sbsiddha-MOBL3.sc.intel.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-06-27 15:17:13 -07:00
Tejun Heo
f740e6cd0c stop_machine: implement stop_machine_from_inactive_cpu()
Currently, mtrr wants stop_machine functionality while a CPU is being
brought up.  As stop_machine() requires the calling CPU to be active,
mtrr implements its own stop_machine using stop_one_cpu() on each
online CPU.  This doesn't only unnecessarily duplicate complex logic
but also introduces a possibility of deadlock when it races against
the generic stop_machine().

This patch implements stop_machine_from_inactive_cpu() to serve such
use cases.  Its functionality is basically the same as stop_machine();
however, it should be called from a CPU which isn't active and doesn't
depend on working scheduling on the calling CPU.

This is achieved by using busy loops for synchronization and
open-coding stop_cpus queuing and waiting with direct invocation of
fn() for local CPU inbetween.

Signed-off-by: Tejun Heo <tj@kernel.org>
Link: http://lkml.kernel.org/r/20110623182056.982526827@sbsiddha-MOBL3.sc.intel.com
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-06-27 15:17:08 -07:00
Jamie Iles
06c3df4952 clocksource: apb: Share APB timer code with other platforms
The APB timers are an IP block from Synopsys (DesignWare APB timers)
and are also found in other systems including ARM SoC's.  This patch
adds functions for creating clock_event_devices and clocksources from
APB timers but does not do the resource allocation.  This is handled
in a higher layer to allow the timers to be created from multiple
methods such as platform_devices.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Jacob Pan <jacob.jun.pan@linux.intel.com>
Signed-off-by: Jamie Iles <jamie@jamieiles.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
2011-06-27 15:16:21 -07:00
Linus Torvalds
a64227b085 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc:
  mmc: queue: bring discard_granularity/alignment into line with SCSI
  mmc: queue: append partition subname to queue thread name
  mmc: core: make erase timeout calculation allow for gated clock
  mmc: block: switch card to User Data Area when removing the block driver
  mmc: sdio: reset card during power_restore
  mmc: cb710: fix #ifdef HAVE_EFFICIENT_UNALIGNED_ACCESS
  mmc: sdhi: DMA slave ID 0 is invalid
  mmc: tmio: fix regression in TMIO_MMC_WRPROTECT_DISABLE handling
  mmc: omap_hsmmc: use original sg_len for dma_unmap_sg
  mmc: omap_hsmmc: fix ocr mask usage
  mmc: sdio: fix runtime PM path during driver removal
  mmc: Add PCI fixup quirks for Ricoh 1180:e823 reader
  mmc: sdhi: fix module unloading
  mmc: of_mmc_spi: add NO_IRQ define to of_mmc_spi.c
  mmc: vub300: fix null dereferences in error handling
2011-06-27 14:55:43 -07:00
KAMEZAWA Hiroyuki
c6830c2260 Fix node_start/end_pfn() definition for mm/page_cgroup.c
commit 21a3c96 uses node_start/end_pfn(nid) for detection start/end
of nodes. But, it's not defined in linux/mmzone.h but defined in
/arch/???/include/mmzone.h which is included only under
CONFIG_NEED_MULTIPLE_NODES=y.

Then, we see
  mm/page_cgroup.c: In function 'page_cgroup_init':
  mm/page_cgroup.c:308: error: implicit declaration of function 'node_start_pfn'
  mm/page_cgroup.c:309: error: implicit declaration of function 'node_end_pfn'

So, fixiing page_cgroup.c is an idea...

But node_start_pfn()/node_end_pfn() is a very generic macro and
should be implemented in the same manner for all archs.
(m32r has different implementation...)

This patch removes definitions of node_start/end_pfn() in each archs
and defines a unified one in linux/mmzone.h. It's not under
CONFIG_NEED_MULTIPLE_NODES, now.

A result of macro expansion is here (mm/page_cgroup.c)

for !NUMA
 start_pfn = ((&contig_page_data)->node_start_pfn);
  end_pfn = ({ pg_data_t *__pgdat = (&contig_page_data); __pgdat->node_start_pfn + __pgdat->node_spanned_pages;});

for NUMA (x86-64)
  start_pfn = ((node_data[nid])->node_start_pfn);
  end_pfn = ({ pg_data_t *__pgdat = (node_data[nid]); __pgdat->node_start_pfn + __pgdat->node_spanned_pages;});

Changelog:
 - fixed to avoid using "nid" twice in node_end_pfn() macro.

Reported-and-acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Reported-and-tested-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-27 14:13:09 -07:00
Suresh Siddha
6d3321e8e2 x86, mtrr: lock stop machine during MTRR rendezvous sequence
MTRR rendezvous sequence using stop_one_cpu_nowait() can potentially
happen in parallel with another system wide rendezvous using
stop_machine(). This can lead to deadlock (The order in which
works are queued can be different on different cpu's. Some cpu's
will be running the first rendezvous handler and others will be running
the second rendezvous handler. Each set waiting for the other set to join
for the system wide rendezvous, leading to a deadlock).

MTRR rendezvous sequence is not implemented using stop_machine() as this
gets called both from the process context aswell as the cpu online paths
(where the cpu has not come online and the interrupts are disabled etc).
stop_machine() works with only online cpus.

For now, take the stop_machine mutex in the MTRR rendezvous sequence that
gets called from an online cpu (here we are in the process context
and can potentially sleep while taking the mutex). And the MTRR rendezvous
that gets triggered during cpu online doesn't need to take this stop_machine
lock (as the stop_machine() already ensures that there is no cpu hotplug
going on in parallel by doing get_online_cpus())

    TBD: Pursue a cleaner solution of extending the stop_machine()
         infrastructure to handle the case where the calling cpu is
         still not online and use this for MTRR rendezvous sequence.

fixes: https://bugzilla.novell.com/show_bug.cgi?id=672008

Reported-by: Vadim Kotelnikov <vadimuzzz@inbox.ru>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Link: http://lkml.kernel.org/r/20110623182056.807230326@sbsiddha-MOBL3.sc.intel.com
Cc: stable@kernel.org # 2.6.35+, backport a week or two after this gets more testing in mainline
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-06-27 14:00:46 -07:00
Johannes Berg
04b7dcf979 wireless: unify QoS control field definitions
Move all that mac80211 has into the generic
ieee80211.h header file and use them. At the
same time move them from mask+shift to just
bits and rename them for consistent names.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-27 15:09:39 -04:00
Oleg Nesterov
0347e17739 ptrace: ptrace_reparented() should check same_thread_group()
ptrace_reparented() naively does parent != real_parent, this means
it returns true even if the tracer _is_ the real parent. This is per
process thing, not per-thread. The only reason ->real_parent can
point to the non-leader thread is that we have __WNOTHREAD.

Change it to check !same_thread_group(parent, real_parent).

It has two callers, and in both cases the current check does not
look right.

exit_notify: we should respect ->exit_signal if the exiting leader
is traced by any thread from the parent thread group. It is the
child of the whole group, and we are going to send the signal to
the whole group.

wait_task_zombie: without __WNOTHREAD do_wait() should do the same
for any thread, only sys_ptrace() is "bound" to the single thread.
However do_wait(WEXITED) succeeds but does not release a traced
natural child unless the caller is the tracer.

Test-case:

	void *tfunc(void *arg)
	{
		assert(ptrace(PTRACE_ATTACH, (long)arg, 0,0) == 0);
		pause();
		return NULL;
	}

	int main(void)
	{
		pthread_t thr;
		pid_t pid, stat, ret;

		pid = fork();
		if (!pid) {
			pause();
			assert(0);
		}

		assert(pthread_create(&thr, NULL, tfunc, (void*)(long)pid) == 0);

		assert(waitpid(-1, &stat, 0) == pid);
		assert(WIFSTOPPED(stat));

		kill(pid, SIGKILL);

		assert(waitpid(-1, &stat, 0) == pid);
		assert(WIFSIGNALED(stat) && WTERMSIG(stat) == SIGKILL);

		ret = waitpid(pid, &stat, 0);
		if (ret < 0)
			return 0;

		printf("WTF? %d is dead, but: wait=%d stat=%x\n",
				pid, ret, stat);

		return 1;
	}

Note that the main thread simply does

	pid = fork();
	kill(pid, SIGKILL);

and then without the patch wait4(WEXITED) succeeds twice and reports
WTERMSIG(stat) == SIGKILL.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
2011-06-27 20:30:10 +02:00
Oleg Nesterov
087806b128 redefine thread_group_leader() as exit_signal >= 0
Change de_thread() to set old_leader->exit_signal = -1. This is
good for the consistency, it is no longer the leader and all
sub-threads have exit_signal = -1 set by copy_process(CLONE_THREAD).

And this allows us to micro-optimize thread_group_leader(), it can
simply check exit_signal >= 0. This also makes sense because we
should move ->group_leader from task_struct to signal_struct.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
2011-06-27 20:30:10 +02:00
Oleg Nesterov
e550f14dc6 kill task_detached()
Upadate the last user of task_detached(), wait_task_zombie(), to
use thread_group_leader() and kill task_detached().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
2011-06-27 20:30:09 +02:00
Oleg Nesterov
8677347378 make do_notify_parent() __must_check, update the callers
Change other callers of do_notify_parent() to check the value it
returns, this makes the subsequent task_detached() unnecessary.
Mark do_notify_parent() as __must_check.

Use thread_group_leader() instead of !task_detached() to check
if we need to notify the real parent in wait_task_zombie().

Remove the stale comment in release_task(). "just for sanity" is
no longer true, we have to set EXIT_DEAD to avoid the races with
do_wait().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
2011-06-27 20:30:09 +02:00
Oleg Nesterov
45cdf5cc07 kill tracehook_notify_death()
Kill tracehook_notify_death(), reimplement the logic in its caller,
exit_notify().

Also, change the exec_id's check to use thread_group_leader() instead
of task_detached(), this is more clear. This logic only applies to
the exiting leader, a sub-thread must never change its exit_signal.

Note: when the traced group leader exits the exit_signal-or-SIGCHLD
logic looks really strange:

	- we notify the tracer even if !thread_group_empty() but
	   do_wait(WEXITED) can't work until all threads exit

	- if the tracer is real_parent, it is not clear why can't
	  we use ->exit_signal event if !thread_group_empty()

-v2: do not try to fix the 2nd oddity to avoid the subtle behavior
     change mixed with reorganization, suggested by Tejun.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Reviewed-by: Tejun Heo <tj@kernel.org>
2011-06-27 20:30:08 +02:00
Oleg Nesterov
53c8f9f199 make do_notify_parent() return bool
- change do_notify_parent() to return a boolean, true if the task should
  be reaped because its parent ignores SIGCHLD.

- update the only caller which checks the returned value, exit_notify().

This temporary uglifies exit_notify() even more, will be cleanuped by
the next change.

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
2011-06-27 20:30:08 +02:00
Jan Kara
bb189247f3 jbd: Fix oops in journal_remove_journal_head()
journal_remove_journal_head() can oops when trying to access journal_head
returned by bh2jh(). This is caused for example by the following race:

	TASK1					TASK2
  journal_commit_transaction()
    ...
    processing t_forget list
      __journal_refile_buffer(jh);
      if (!jh->b_transaction) {
        jbd_unlock_bh_state(bh);
					journal_try_to_free_buffers()
					  journal_grab_journal_head(bh)
					  jbd_lock_bh_state(bh)
					  __journal_try_to_free_buffer()
					  journal_put_journal_head(jh)
        journal_remove_journal_head(bh);

journal_put_journal_head() in TASK2 sees that b_jcount == 0 and buffer is not
part of any transaction and thus frees journal_head before TASK1 gets to doing
so. Note that even buffer_head can be released by try_to_free_buffers() after
journal_put_journal_head() which adds even larger opportunity for oops (but I
didn't see this happen in reality).

Fix the problem by making transactions hold their own journal_head reference
(in b_jcount). That way we don't have to remove journal_head explicitely via
journal_remove_journal_head() and instead just remove journal_head when
b_jcount drops to zero. The result of this is that [__]journal_refile_buffer(),
[__]journal_unfile_buffer(), and __journal_remove_checkpoint() can free
journal_head which needs modification of a few callers. Also we have to be
careful because once journal_head is removed, buffer_head might be freed as
well. So we have to get our own buffer_head reference where it matters.

Signed-off-by: Jan Kara <jack@suse.cz>
2011-06-27 11:44:37 +02:00
Akinobu Mita
9008593017 ext3: use proper little-endian bitops
ext3_{set,clear}_bit() is defined as __test_and_{set,clear}_bit_le()
for ext3.  But all ext3_{set,clear}_bit() calls ignore return values.
So these can be replaced with __{set,clear}_bit_le().

This changes ext3_{set,clear}_bit safely, because if someone uses
these macros without noticing the change, new ext3_{set,clear}_bit
don't have return value and causes compiler errors where the return
value is used.

This also removes unused ext3_find_first_zero_bit().

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andreas Dilger <adilger.kernel@dilger.ca>
Cc: linux-ext4@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
2011-06-25 17:29:52 +02:00
Petr Uzel
fbcc9e624b ext2: include fs.h into ext2_fs.h
AC_CHECK_HEADERS([linux/ext2_fs.h])
fails with

configure:34666: checking linux/ext2_fs.h usability
configure:34666: gcc -std=gnu99 -c -ggdb3 -O0 -Wunreachable-code  conftest.c >&5
In file included from conftest.c:406:0:
/usr/include/linux/ext2_fs.h: In function 'ext2_mask_flags':
/usr/include/linux/ext2_fs.h:182:21: error: 'FS_DIRSYNC_FL' undeclared (first use in this function)
/usr/include/linux/ext2_fs.h:182:21: note: each undeclared identifier is reported only once for each function it appears in
/usr/include/linux/ext2_fs.h:182:37: error: 'FS_TOPDIR_FL' undeclared (first use in this function)
/usr/include/linux/ext2_fs.h:184:19: error: 'FS_NODUMP_FL' undeclared (first use in this function)
/usr/include/linux/ext2_fs.h:184:34: error: 'FS_NOATIME_FL' undeclared (first use in this function)

It's reasonable to have headers that include all necessary definitions. So fix
this by including fs.h into ext2_fs.h.

Signed-off-by: Petr Uzel <petr.uzel@suse.cz>
Signed-off-by: Jan Kara <jack@suse.cz>
2011-06-25 17:29:52 +02:00
Jan Kara
40680f2fa4 ext3: Convert ext3 to new truncate calling convention
Mostly trivial conversion. We fix a bug that IS_IMMUTABLE and IS_APPEND files
could not be truncated during failed writes as we change the code.  In fact the
test is not needed at all because both IS_IMMUTABLE and IS_APPEND is tested in
upper layers in do_sys_[f]truncate(), may_write(), etc.

Signed-off-by: Jan Kara <jack@suse.cz>
2011-06-25 17:29:51 +02:00
John W. Linville
36099365c7 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6 into for-davem
Conflicts:
	drivers/net/wireless/rtlwifi/pci.c
	include/linux/netlink.h
2011-06-24 15:25:51 -04:00
Linus Torvalds
5220cc9382 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
* 'for-linus' of git://git.kernel.dk/linux-block:
  block: add REQ_SECURE to REQ_COMMON_MASK
  block: use the passed in @bdev when claiming if partno is zero
  block: Add __attribute__((format(printf...) and fix fallout
  block: make disk_block_events() properly wait for work cancellation
  block: remove non-syncing __disk_block_events() and fold it into disk_block_events()
  block: don't use non-syncing event blocking in disk_check_events()
  cfq-iosched: fix locking around ioc->ioc_data assignment
2011-06-24 08:42:35 -07:00
Timur Tabi
39785eb1d3 fsl-diu-fb: remove check for pixel clock ranges
The Freescale DIU framebuffer driver defines two constants, MIN_PIX_CLK and
MAX_PIX_CLK, that are supposed to represent the lower and upper limits of
the pixel clock.  These values, however, are true only for one platform
clock rate (533MHz) and only for the MPC8610.  So the actual range for
the pixel clock is chip-specific, which means the current values are almost
always wrong.  The chance of an out-of-range pixel clock being used are also
remote.

Rather than try to detect an out-of-range clock in the DIU driver, we depend
on the board-specific pixel clock function (e.g. p1022ds_set_pixel_clock)
to clamp the pixel clock to a supported value.

Signed-off-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Paul Mundt <lethal@linux-sh.org>
2011-06-24 17:08:49 +09:00
David Wagner
01dc9cc314 UBI: clarify the volume notification types' doc
I realized the new descriptions of ADDED and REMOVED could also be
misleading: they can also be triggered after using a userland util
(ubi{mk,rm}vol).

Artem: amend the commentaries

Signed-off-by David Wagner <david.wagner@free-electrons.com>
Signed-off-by: Artem Bityutskiy <dedekind1@gmail.com>
2011-06-23 17:40:07 +03:00
Linus Torvalds
68d0080f1e Merge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6
* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
  PCI / PM: Block races between runtime PM and system sleep
  PM / Domains: Update documentation
  PM / Runtime: Handle clocks correctly if CONFIG_PM_RUNTIME is unset
  PM: Fix async resume following suspend failure
  PM: Free memory bitmaps if opening /dev/snapshot fails
  PM: Rename dev_pm_info.in_suspend to is_prepared
  PM: Update documentation regarding sysdevs
  PM / Runtime: Update doc: usage count no longer incremented across system PM
2011-06-22 21:08:52 -07:00
Frederic Weisbecker
d902db1eb6 sched: Generalize sleep inside spinlock detection
The sleeping inside spinlock detection is actually used
for more general sleeping inside atomic sections
debugging: preemption disabled, rcu read side critical
sections, interrupts, interrupt disabled, etc...

Change the name of the config and its help section to
reflect its more general role.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Ingo Molnar <mingo@elte.hu>
2011-06-23 00:44:38 +02:00
Thomas Gleixner
a7de915383 genirq: Remove unused CHECK_IRQ_PER_CPU()
No more users. Kill it.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2011-06-22 22:55:02 +02:00
Gabor Juhos
7d95847c9b ath9k: add external_reset callback to ath9k_platfom_data for AR9330
The patch adds a callback to ath9k_platform_data. If the
callback is provided by the platform code, then it can be
used to hard reset the WMAC device.

The callback is required for doing a hard reset of the AR9330
chips to get them working again after a hang.

Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-22 16:09:57 -04:00
Gabor Juhos
3762561aa8 ath9k: add MAC revision detection for AR9330
The AR9330 1.0 and 1.1 are using the same revision,
thus it is not possible to distinguish the two chips.
The platform setup code can distinguish the chips based
on the SoC revision.

Add a callback function to ath9k_platform_data in order
to allow getting the revision number from the platform code.

Signed-off-by: Gabor Juhos <juhosg@openwrt.org>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-22 16:09:49 -04:00
Johannes Berg
670dc2833d netlink: advertise incomplete dumps
Consider the following situation:
 * a dump that would show 8 entries, four in the first
   round, and four in the second
 * between the first and second rounds, 6 entries are
   removed
 * now the second round will not show any entry, and
   even if there is a sequence/generation counter the
   application will not know

To solve this problem, add a new flag NLM_F_DUMP_INTR
to the netlink header that indicates the dump wasn't
consistent, this flag can also be set on the MSG_DONE
message that terminates the dump, and as such above
situation can be detected.

To achieve this, add a sequence counter to the netlink
callback struct. Of course, netlink code still needs
to use this new functionality. The correct way to do
that is to always set cb->seq when a dumpit callback
is invoked and call nl_dump_check_consistent() for
each new message. The core code will also call this
function for the final MSG_DONE message.

To make it usable with generic netlink, a new function
genlmsg_nlhdr() is needed to obtain the netlink header
from the genetlink user header.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: John W. Linville <linville@tuxdriver.com>
2011-06-22 16:09:45 -04:00
Tejun Heo
06d984737b ptrace: s/tracehook_tracer_task()/ptrace_parent()/
tracehook.h is on the way out.  Rename tracehook_tracer_task() to
ptrace_parent() and move it from tracehook.h to ptrace.h.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Stephen Smalley <sds@tycho.nsa.gov>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-22 19:26:29 +02:00
Tejun Heo
4b9d33e6d8 ptrace: kill clone/exec tracehooks
At this point, tracehooks aren't useful to mainline kernel and mostly
just add an extra layer of obfuscation.  Although they have comments,
without actual in-kernel users, it is difficult to tell what are their
assumptions and they're actually trying to achieve.  To mainline
kernel, they just aren't worth keeping around.

This patch kills the following clone and exec related tracehooks.

	tracehook_prepare_clone()
	tracehook_finish_clone()
	tracehook_report_clone()
	tracehook_report_clone_complete()
	tracehook_unsafe_exec()

The changes are mostly trivial - logic is moved to the caller and
comments are merged and adjusted appropriately.

The only exception is in check_unsafe_exec() where LSM_UNSAFE_PTRACE*
are OR'd to bprm->unsafe instead of setting it, which produces the
same result as the field is always zero on entry.  It also tests
p->ptrace instead of (p->ptrace & PT_PTRACED) for consistency, which
also gives the same result.

This doesn't introduce any behavior change.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-22 19:26:29 +02:00
Tejun Heo
a288eecce5 ptrace: kill trivial tracehooks
At this point, tracehooks aren't useful to mainline kernel and mostly
just add an extra layer of obfuscation.  Although they have comments,
without actual in-kernel users, it is difficult to tell what are their
assumptions and they're actually trying to achieve.  To mainline
kernel, they just aren't worth keeping around.

This patch kills the following trivial tracehooks.

* Ones testing whether task is ptraced.  Replace with ->ptrace test.

	tracehook_expect_breakpoints()
	tracehook_consider_ignored_signal()
	tracehook_consider_fatal_signal()

* ptrace_event() wrappers.  Call directly.

	tracehook_report_exec()
	tracehook_report_exit()
	tracehook_report_vfork_done()

* ptrace_release_task() wrapper.  Call directly.

	tracehook_finish_release_task()

* noop

	tracehook_prepare_release_task()
	tracehook_report_death()

This doesn't introduce any behavior change.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-22 19:26:28 +02:00
Tejun Heo
f3c04b934d ptrace: move SIGTRAP on exec(2) logic to ptrace_event()
Move SIGTRAP on exec(2) logic from tracehook_report_exec() to
ptrace_event().  This is part of changes to make ptrace_event()
smarter and handle ptrace event related details in one place.

This doesn't introduce any behavior change.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-22 19:26:28 +02:00
Tejun Heo
643ad8388e ptrace: introduce ptrace_event_enabled() and simplify ptrace_event() and tracehook_prepare_clone()
This patch implements ptrace_event_enabled() which tests whether a
given PTRACE_EVENT_* is enabled and use it to simplify ptrace_event()
and tracehook_prepare_clone().

PT_EVENT_FLAG() macro is added which calculates PT_TRACE_* flag from
PTRACE_EVENT_*.  This is used to define PT_TRACE_* flags and by
ptrace_event_enabled() to find the matching flag.

This is used to make ptrace_event() and tracehook_prepare_clone()
simpler.

* ptrace_event() callers were responsible for providing mask to test
  whether the event was enabled.  This patch implements
  ptrace_event_enabled() and make ptrace_event() drop @mask and
  determine whether the event is enabled from @event.  Note that
  @event is constant and this conversion doesn't add runtime overhead.

  All conversions except tracehook_report_clone_complete() are
  trivial.  tracehook_report_clone_complete() used to use 0 for @mask
  (always enabled) but now tests whether the specified event is
  enabled.  This doesn't cause any behavior difference as it's
  guaranteed that the event specified by @trace is enabled.

* tracehook_prepare_clone() now only determines which event is
  applicable and use ptrace_event_enabled() for enable test.

This doesn't introduce any behavior change.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-22 19:26:28 +02:00
Tejun Heo
d21142ece4 ptrace: kill task_ptrace()
task_ptrace(task) simply dereferences task->ptrace and isn't even used
consistently only adding confusion.  Kill it and directly access
->ptrace instead.

This doesn't introduce any behavior change.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2011-06-22 19:26:27 +02:00
Alexey Dobriyan
b7f080cfe2 net: remove mm.h inclusion from netdevice.h
Remove linux/mm.h inclusion from netdevice.h -- it's unused (I've checked manually).

To prevent mm.h inclusion via other channels also extract "enum dma_data_direction"
definition into separate header. This tiny piece is what gluing netdevice.h with mm.h
via "netdevice.h => dmaengine.h => dma-mapping.h => scatterlist.h => mm.h".
Removal of mm.h from scatterlist.h was tried and was found not feasible
on most archs, so the link was cutoff earlier.

Hope people are OK with tiny include file.

Note, that mm_types.h is still dragged in, but it is a separate story.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-21 19:17:20 -07:00
Linus Torvalds
2992c4bd57 Merge branch 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6
* 'bugfixes' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6:
  NFS: Fix decode_secinfo_maxsz
  NFSv4.1: Fix an off-by-one error in pnfs_generic_pg_test
  NFSv4.1: Fix some issues with pnfs_generic_pg_test
  NFSv4.1: file layout must consider pg_bsize for coalescing
  pnfs-obj: No longer needed to take an extra ref at add_device
  SUNRPC: Ensure the RPC client only quits on fatal signals
  NFSv4: Fix a readdir regression
  nfs4.1: mark layout as bad on error path in _pnfs_return_layout
  nfs4.1: prevent race that allowed use of freed layout in _pnfs_return_layout
  NFSv4.1: need to put_layout_hdr on _pnfs_return_layout error path
  NFS: (d)printks should use %zd for ssize_t arguments
  NFSv4.1: fix break condition in pnfs_find_lseg
  nfs4.1: fix several problems with _pnfs_return_layout
  NFSv4.1: allow zero fh array in filelayout decode layout
  NFSv4.1: allow nfs_fhget to succeed with mounted on fileid
  NFSv4.1: Fix a refcounting issue in the pNFS device id cache
  NFSv4.1: deprecate headerpadsz in CREATE_SESSION
  NFS41: do not update isize if inode needs layoutcommit
  NLM: Don't hang forever on NLM unlock requests
  NFS: fix umount of pnfs filesystems
2011-06-21 18:20:55 -07:00
John Fastabend
f9ae7e4b51 dcb: Add ieee_dcb_delapp() and dcb op to delete app entry
Now that we allow multiple IEEE App entries we need a way
to remove specific entries. To do this add the ieee_dcb_delapp()
routine.

Additionaly drivers may need to remove the APP entry from
their firmware tables. Add dcb ops routine to handle this.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-21 16:06:11 -07:00
John Fastabend
314b4778ed net: dcbnl, add multicast group for DCB
Now that dcbnl is being used in many cases by more
than a single agent it is beneficial to be notified
when some entity either driver or user space has
changed the DCB attributes.

Today applications either end up polling the interface
or relying on a user space database to maintain the DCB
state and post events. Polling is a poor solution for
obvious reasons. And relying on a user space database
has its own downside. Namely it has created strange
boot dependencies requiring the database be populated
before any applications dependent on DCB attributes
starts or the application goes into a polling loop.
Populating the database requires negotiating link
setting with the peer and can take anywhere from less
than a second up to a few seconds depending on the switch
implementation.

Perhaps more importantly if another application or an
embedded agent sets a DCB link attribute the database
has no way of knowing other than polling the kernel.
This prevents applications from responding quickly to
changes in link events which at least in the FCoE case
and probably any other protocols expecting a lossless
link may result in IO errors.

By adding a multicast group for DCB we have clean way
to disseminate kernel DCB link attributes up to user
space. Avoiding the need for user space to maintain
a coherant database and disperse events that potentially
do not reflect the current link state.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-21 16:06:11 -07:00
Michael Chan
f4b5ad26bc cnic, bnx2i: Add support for new devices - 57800, 57810, and 57840
And change iSCSI RQ doorbell size from 16B to 64B to match new firmware.

Signed-off-by: Michael Chan <mchan@broadcom.com>
Signed-off-by: Eddie Wai <eddie.wai@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2011-06-21 16:06:11 -07:00
Alan Stern
6d0e0e84f6 PM: Fix async resume following suspend failure
The PM core doesn't handle suspend failures correctly when it comes to
asynchronously suspended devices.  These devices are moved onto the
dpm_suspended_list as soon as the corresponding async thread is
started up, and they remain on the list even if they fail to suspend
or the sleep transition is cancelled before they get suspended.  As a
result, when the PM core unwinds the transition, it tries to resume
the devices even though they were never suspended.

This patch (as1474) fixes the problem by adding a new "is_suspended"
flag to dev_pm_info.  Devices are resumed only if the flag is set.

[rjw:
 * Moved the dev->power.is_suspended check into device_resume(),
   because we need to complete dev->power.completion and clear
   dev->power.is_prepared too for devices whose
   dev->power.is_suspended flags are unset.
 * Fixed __device_suspend() to avoid setting dev->power.is_suspended
   if async_error is different from zero.]

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: stable@kernel.org
2011-06-21 23:20:20 +02:00
Alan Stern
f76b168b6f PM: Rename dev_pm_info.in_suspend to is_prepared
This patch (as1473) renames the "in_suspend" field in struct
dev_pm_info to "is_prepared", in preparation for an upcoming change.
The new name is more descriptive of what the field really means.

Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Cc: stable@kernel.org
2011-06-21 23:19:50 +02:00
Linus Torvalds
890879cfa0 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  jbd2: Fix oops in jbd2_journal_remove_journal_head()
  jbd2: Remove obsolete parameters in the comments for some jbd2 functions
  ext4: fixed tracepoints cleanup
  ext4: use FIEMAP_EXTENT_LAST flag for last extent in fiemap
  ext4: Fix max file size and logical block counting of extent format file
  ext4: correct comments for ext4_free_blocks()
2011-06-21 10:22:35 -07:00
Grant Likely
15c3597d6e dt/platform: allow device name to be overridden
Some platform code has specific requirements on the naming of devices.
This patch allows callers of of_platform_populate() to provide a
device name lookup table.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-06-21 11:04:10 -06:00
Grant Likely
29d4f8a497 dt: add of_platform_populate() for creating device from the device tree
of_platform_populate() is similar to of_platform_bus_probe() except
that it strictly enforces that all device nodes must have a compatible
property, and it can be used to register devices (not buses) which are
children of the root node.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-06-21 11:02:29 -06:00
Grant Likely
cbb49c2665 dt: Add default match table for bus ids
No need for most platforms to define their own bus table when calling
of_platform_populate().  Supply a stock one.

Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-06-21 11:02:29 -06:00
Joerg Roedel
801019d59d Merge branches 'amd/transparent-bridge' and 'core'
Conflicts:
	arch/x86/include/asm/amd_iommu_types.h
	arch/x86/kernel/amd_iommu.c

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-21 11:14:10 +02:00
Joerg Roedel
403f81d8ee iommu/amd: Move missing parts to drivers/iommu
A few parts of the driver were missing in drivers/iommu.
Move them there to have the complete driver in that
directory.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-21 10:49:31 +02:00
Ohad Ben-Cohen
166e9278a3 x86/ia64: intel-iommu: move to drivers/iommu/
This should ease finding similarities with different platforms,
with the intention of solving problems once in a generic framework
which everyone can use.

Note: to move intel-iommu.c, the declaration of pci_find_upstream_pcie_bridge()
has to move from drivers/pci/pci.h to include/linux/pci.h. This is handled
in this patch, too.

As suggested, also drop DMAR's EXPERIMENTAL tag while we're at it.

Compile-tested on x86_64.

Signed-off-by: Ohad Ben-Cohen <ohad@wizery.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
2011-06-21 10:49:30 +02:00
David S. Miller
9f6ec8d697 Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
Conflicts:
	drivers/net/wireless/iwlwifi/iwl-agn-rxon.c
	drivers/net/wireless/rtlwifi/pci.c
	net/netfilter/ipvs/ip_vs_core.c
2011-06-20 22:29:08 -07:00
Linus Torvalds
79568f5be0 vfs: i_state needs to be 'unsigned long' for now
Commit 13e12d14e2 ("vfs: reorganize 'struct inode' layout a bit")
moved things around a bit changed i_state to be unsigned int instead of
unsigned long.  That was to help structure layout for the 64-bit case,
and shrink 'struct inode' a bit (admittedly that only happened when
spinlock debugging was on and i_flags didn't pack with i_lock).

However, Meelis Roos reports that this results in unaligned exceptions
on sprc, and it turns out that the bit-locking primitives that we use
for the I_NEW bit want to use the bitops.  Which want 'unsigned long',
not 'unsigned int'.

We really should fix the bit locking code to not have that kind of
requirement, but that's a much bigger change.  So for now, revert that
field back to 'unsigned long' (but keep the other re-ordering changes
from the commit that caused this).

Andi points out that we have played games with this in 'struct page', so
it's solvable with other hacks too, but since right now the struct inode
size advantage only happens with some rare config options, it's not
worth fighting.

It _would_ be worth fixing the bitlocking code, though.  Especially
since there is no type safety in the bitlocking code (this never caused
any warnings, and worked fine on x86-64, because the bitlocks take a
'void *' and x86-64 doesn't care that deeply about alignment).  So it's
currently a very easy problem to trigger by mistake and never notice.

Reported-by: Meelis Roos <mroos@linux.ee>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-06-20 20:13:49 -07:00
Linus Torvalds
eda0841094 Merge branch 'for-2.6.40' of git://linux-nfs.org/~bfields/linux
* 'for-2.6.40' of git://linux-nfs.org/~bfields/linux:
  nfsd4: fix break_lease flags on nfsd open
  nfsd: link returns nfserr_delay when breaking lease
  nfsd: v4 support requires CRYPTO
  nfsd: fix dependency of nfsd on auth_rpcgss
2011-06-20 20:10:52 -07:00
Linus Torvalds
3669820650 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
  devcgroup_inode_permission: take "is it a device node" checks to inlined wrapper
  fix comment in generic_permission()
  kill obsolete comment for follow_down()
  proc_sys_permission() is OK in RCU mode
  reiserfs_permission() doesn't need to bail out in RCU mode
  proc_fd_permission() is doesn't need to bail out in RCU mode
  nilfs2_permission() doesn't need to bail out in RCU mode
  logfs doesn't need ->permission() at all
  coda_ioctl_permission() is safe in RCU mode
  cifs_permission() doesn't need to bail out in RCU mode
  bad_inode_permission() is safe from RCU mode
  ubifs: dereferencing an ERR_PTR in ubifs_mount()
2011-06-20 20:09:15 -07:00
Benny Halevy
19345cb299 NFSv4.1: file layout must consider pg_bsize for coalescing
Otherwise we end up overflowing the rpc buffer size on the receive end.

Signed-off-by: Benny Halevy <benny@tonian.com>
Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
2011-06-20 16:12:26 -04:00