Pull scheduler changes from Ingo Molnar:
"The biggest change is the cleanup/simplification of the load-balancer:
instead of the current practice of architectures twiddling scheduler
internal data structures and providing the scheduler domains in
colorfully inconsistent ways, we now have generic scheduler code in
kernel/sched/core.c:sched_init_numa() that looks at the architecture's
node_distance() parameters and (while not fully trusting it) deducts a
NUMA topology from it.
This inevitably changes balancing behavior - hopefully for the better.
There are various smaller optimizations, cleanups and fixlets as well"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched: Taint kernel with TAINT_WARN after sleep-in-atomic bug
sched: Remove stale power aware scheduling remnants and dysfunctional knobs
sched/debug: Fix printing large integers on 32-bit platforms
sched/fair: Improve the ->group_imb logic
sched/nohz: Fix rq->cpu_load[] calculations
sched/numa: Don't scale the imbalance
sched/fair: Revert sched-domain iteration breakage
sched/x86: Rewrite set_cpu_sibling_map()
sched/numa: Fix the new NUMA topology bits
sched/numa: Rewrite the CONFIG_NUMA sched domain support
sched/fair: Propagate 'struct lb_env' usage into find_busiest_group
sched/fair: Add some serialization to the sched_domain load-balance walk
sched/fair: Let minimally loaded cpu balance the group
sched: Change rq->nr_running to unsigned int
x86/numa: Check for nonsensical topologies on real hw as well
x86/numa: Hard partition cpu topology masks on node boundaries
x86/numa: Allow specifying node_distance() for numa=fake
x86/sched: Make mwait_usable() heed to "idle=" kernel parameters properly
sched: Update documentation and comments
sched_rt: Avoid unnecessary dequeue and enqueue of pushable tasks in set_cpus_allowed_rt()
Pull perf changes from Ingo Molnar:
"Lots of changes:
- (much) improved assembly annotation support in perf report, with
jump visualization, searching, navigation, visual output
improvements and more.
- kernel support for AMD IBS PMU hardware features. Notably 'perf
record -e cycles:p' and 'perf top -e cycles:p' should work without
skid now, like PEBS does on the Intel side, because it takes
advantage of IBS transparently.
- the libtracevents library: it is the first step towards unifying
tracing tooling and perf, and it also gives a tracing library for
external tools like powertop to rely on.
- infrastructure: various improvements and refactoring of the UI
modules and related code
- infrastructure: cleanup and simplification of the profiling
targets code (--uid, --pid, --tid, --cpu, --all-cpus, etc.)
- tons of robustness fixes all around
- various ftrace updates: speedups, cleanups, robustness
improvements.
- typing 'make' in tools/ will now give you a menu of projects to
build and a short help text to explain what each does.
- ... and lots of other changes I forgot to list.
The perf record make bzImage + perf report regression you reported
should be fixed."
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (166 commits)
tracing: Remove kernel_lock annotations
tracing: Fix initial buffer_size_kb state
ring-buffer: Merge separate resize loops
perf evsel: Create events initially disabled -- again
perf tools: Split term type into value type and term type
perf hists: Fix callchain ip printf format
perf target: Add uses_mmap field
ftrace: Remove selecting FRAME_POINTER with FUNCTION_TRACER
ftrace/x86: Have x86 ftrace use the ftrace_modify_all_code()
ftrace: Make ftrace_modify_all_code() global for archs to use
ftrace: Return record ip addr for ftrace_location()
ftrace: Consolidate ftrace_location() and ftrace_text_reserved()
ftrace: Speed up search by skipping pages by address
ftrace: Remove extra helper functions
ftrace: Sort all function addresses, not just per page
tracing: change CPU ring buffer state from tracing_cpumask
tracing: Check return value of tracing_dentry_percpu()
ring-buffer: Reset head page before running self test
ring-buffer: Add integrity check at end of iter read
ring-buffer: Make addition of pages in ring buffer atomic
...
Here's the driver core, and other driver subsystems, pull request for
the 3.5-rc1 merge window.
Outside of a few minor driver core changes, we ended up with the
following different subsystem and core changes as well, due to
interdependancies on the driver core:
- hyperv driver updates
- drivers/memory being created and some drivers moved into it
- extcon driver subsystem created out of the old Android staging switch
driver code
- dynamic debug updates
- printk rework, and /dev/kmsg changes
All of this has been tested in the linux-next releases for a few weeks
with no reported problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iEYEABECAAYFAk+7q28ACgkQMUfUDdst+ykXmwCfcPASzC+/bDkuqdWsqzxlWZ7+
VOQAnAriySv397St36J6Hz5bMQZwB1Yq
=SQc+
-----END PGP SIGNATURE-----
Merge tag 'driver-core-3.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg Kroah-Hartman:
"Here's the driver core, and other driver subsystems, pull request for
the 3.5-rc1 merge window.
Outside of a few minor driver core changes, we ended up with the
following different subsystem and core changes as well, due to
interdependancies on the driver core:
- hyperv driver updates
- drivers/memory being created and some drivers moved into it
- extcon driver subsystem created out of the old Android staging
switch driver code
- dynamic debug updates
- printk rework, and /dev/kmsg changes
All of this has been tested in the linux-next releases for a few weeks
with no reported problems.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
Fix up conflicts in drivers/extcon/extcon-max8997.c where git noticed
that a patch to the deleted drivers/misc/max8997-muic.c driver needs to
be applied to this one.
* tag 'driver-core-3.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (90 commits)
uio_pdrv_genirq: get irq through platform resource if not set otherwise
memory: tegra{20,30}-mc: Remove empty *_remove()
printk() - isolate KERN_CONT users from ordinary complete lines
sysfs: get rid of some lockdep false positives
Drivers: hv: util: Properly handle version negotiations.
Drivers: hv: Get rid of an unnecessary check in vmbus_prep_negotiate_resp()
memory: tegra{20,30}-mc: Use dev_err_ratelimited()
driver core: Add dev_*_ratelimited() family
Driver Core: don't oops with unregistered driver in driver_find_device()
printk() - restore prefix/timestamp printing for multi-newline strings
printk: add stub for prepend_timestamp()
ARM: tegra30: Make MC optional in Kconfig
ARM: tegra20: Make MC optional in Kconfig
ARM: tegra30: MC: Remove unnecessary BUG*()
ARM: tegra20: MC: Remove unnecessary BUG*()
printk: correctly align __log_buf
ARM: tegra30: Add Tegra Memory Controller(MC) driver
ARM: tegra20: Add Tegra Memory Controller(MC) driver
printk() - restore timestamp printing at console output
printk() - do not merge continuation lines of different threads
...
Pull security subsystem updates from James Morris:
"New notable features:
- The seccomp work from Will Drewry
- PR_{GET,SET}_NO_NEW_PRIVS from Andy Lutomirski
- Longer security labels for Smack from Casey Schaufler
- Additional ptrace restriction modes for Yama by Kees Cook"
Fix up trivial context conflicts in arch/x86/Kconfig and include/linux/filter.h
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (65 commits)
apparmor: fix long path failure due to disconnected path
apparmor: fix profile lookup for unconfined
ima: fix filename hint to reflect script interpreter name
KEYS: Don't check for NULL key pointer in key_validate()
Smack: allow for significantly longer Smack labels v4
gfp flags for security_inode_alloc()?
Smack: recursive tramsmute
Yama: replace capable() with ns_capable()
TOMOYO: Accept manager programs which do not start with / .
KEYS: Add invalidation support
KEYS: Do LRU discard in full keyrings
KEYS: Permit in-place link replacement in keyring list
KEYS: Perform RCU synchronisation on keys prior to key destruction
KEYS: Announce key type (un)registration
KEYS: Reorganise keys Makefile
KEYS: Move the key config into security/keys/Kconfig
KEYS: Use the compat keyctl() syscall wrapper on Sparc64 for Sparc32 compat
Yama: remove an unused variable
samples/seccomp: fix dependencies on arch macros
Yama: add additional ptrace scopes
...
Pull smp hotplug cleanups from Thomas Gleixner:
"This series is merily a cleanup of code copied around in arch/* and
not changing any of the real cpu hotplug horrors yet. I wish I'd had
something more substantial for 3.5, but I underestimated the lurking
horror..."
Fix up trivial conflicts in arch/{arm,sparc,x86}/Kconfig and
arch/sparc/include/asm/thread_info_32.h
* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (79 commits)
um: Remove leftover declaration of alloc_task_struct_node()
task_allocator: Use config switches instead of magic defines
sparc: Use common threadinfo allocator
score: Use common threadinfo allocator
sh-use-common-threadinfo-allocator
mn10300: Use common threadinfo allocator
powerpc: Use common threadinfo allocator
mips: Use common threadinfo allocator
hexagon: Use common threadinfo allocator
m32r: Use common threadinfo allocator
frv: Use common threadinfo allocator
cris: Use common threadinfo allocator
x86: Use common threadinfo allocator
c6x: Use common threadinfo allocator
fork: Provide kmemcache based thread_info allocator
tile: Use common threadinfo allocator
fork: Provide weak arch_release_[task_struct|thread_info] functions
fork: Move thread info gfp flags to header
fork: Remove the weak insanity
sh: Remove cpu_idle_wait()
...
There is no point having the NET dependency on the select target, as it
forces all users to depend on NET to tell they support BPF_JIT. Move
the config option to the bottom of the file - this could be a nice place
also for future "selectable" config symbols.
Fix up all users to drop the dependency on NET now that it is not
required to supress warnings for non-NET builds.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Merge reason: We are going to queue up a dependent patch:
"perf tools: Move parse event automated tests to separated object"
That depends on:
commit e7c72d8
perf tools: Add 'G' and 'H' modifiers to event parsing
Conflicts:
tools/perf/builtin-stat.c
Conflicted with the recent 'perf_target' patches when checking the
result of perf_evsel open routines to see if a retry is needed to cope
with older kernels where the exclude guest/host perf_event_attr bits
were not used.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When handling the H_BULK_REMOVE hypercall, we were forgetting to
invalidate and unlock the hashed page table entry (HPTE) in the case
where the page had been paged out. This fixes it by clearing the
first doubleword of the HPTE in that case.
This fixes a regression introduced in commit a92bce95f0 ("KVM: PPC:
Book3S HV: Keep HPTE locked when invalidating"). The effect of the
regression is that the host kernel will sometimes hang when under
memory pressure.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
The code forgot to scramble the VSIDs the way we normally do
and was basically using the "proto VSID" directly with the MMU.
This means that in practice, KVM used random VSIDs that could
collide with segments used by other user space programs.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[agraf: simplify ppc32 case]
Signed-off-by: Alexander Graf <agraf@suse.de>
When jumping back into the kernel to code that knows that it would be
using HSRR registers instead of SRR registers, we need to make sure we
pass it all information on where to jump to in HSRR registers.
Unfortunately, we used r10 to store the information to distinguish between
the HSRR and SRR case. That register got clobbered in between though,
rendering the later comparison invalid.
Instead, let's use cr1 to store this information. That way we don't
need yet another register and everyone's happy.
This fixes PR KVM on POWER7 bare metal for me.
Signed-off-by: Alexander Graf <agraf@suse.de>
When running on a system that is HV capable, some interrupts use HSRR
SPRs instead of the normal SRR SPRs. These are also used in the Linux
handlers to jump back to code after an interrupt got processed.
Unfortunately, in our "jump back to the real host handler after we've
done the context switch" code, we were only setting the SRR SPRs,
rendering Linux to jump back to some invalid IP after it's processed
the interrupt.
This fixes random crashes on p7 opal mode with PR KVM for me.
Signed-off-by: Alexander Graf <agraf@suse.de>
In addition to normal "priviledged instruction" traps, we can also receive
"emulation assist" traps on newer hardware that has the HV bit set.
Handle that one the same way as a privileged instruction, including the
instruction fetching. That way we don't execute old instructions that we
happen to still leave in that field when an emul assist trap comes.
This fixes -M mac99 / -M g3beige on p7 bare metal for me.
Signed-off-by: Alexander Graf <agraf@suse.de>
So we have another case of paca->irq_happened getting out of
sync with the HW irq state. This can happen when a perfmon
interrupt occurs while soft disabled, as it will return to a
soft disabled but hard enabled context while leaving a stale
PACA_IRQ_HARD_DIS flag set.
This patch fixes it, and also adds a test for the condition
of those flags being out of sync in arch_local_irq_restore()
when CONFIG_TRACE_IRQFLAGS is enabled.
This helps catching those gremlins faster (and so far I
can't seem see any anymore, so that's good news).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Pull KVM fixes from Avi Kivity:
"Two asynchronous page fault fixes (one guest, one host), a powerpc
page refcount fix, and an ia64 build fix."
* git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: ia64: fix build due to typo
KVM: PPC: Book3S HV: Fix refcounting of hugepages
KVM: Do not take reference to mm during async #PF
KVM: ensure async PF event wakes up vcpu from halt
Pull powerpc fixes from Benjamin Herrenschmidt:
"Here are a couple of last minute fixes for 3.4 for regressions
introduced by my rewrite of the lazy irq masking code."
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
powerpc/irq: Make alignment & program interrupt behave the same
powerpc/irq: Fix bug with new lazy IRQ handling code
We always need to pass the last sample period to
perf_sample_data_init(), otherwise the event distribution will be
wrong. Thus, modifiyng the function interface with the required period
as argument. So basically a pattern like this:
perf_sample_data_init(&data, ~0ULL);
data.period = event->hw.last_period;
will now be like that:
perf_sample_data_init(&data, ~0ULL, event->hw.last_period);
Avoids unininitialized data.period and simplifies code.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1333390758-10893-3-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The current code groups up to 16 nodes in a level and then puts an
ALLNODES domain spanning the entire tree on top of that. This doesn't
reflect the numa topology and esp for the smaller not-fully-connected
machines out there today this might make a difference.
Therefore, build a proper numa topology based on node_distance().
Since there's no fixed numa layers anymore, the static SD_NODE_INIT
and SD_ALLNODES_INIT aren't usable anymore, the new code tries to
construct something similar and scales some values either on the
number of cpus in the domain and/or the node_distance() ratio.
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Anton Blanchard <anton@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: linux-alpha@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-sh@vger.kernel.org
Cc: Matt Turner <mattst88@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: sparclinux@vger.kernel.org
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86@kernel.org
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Greg Pearson <greg.pearson@hp.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: bob.picco@oracle.com
Cc: chris.mason@oracle.com
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-r74n3n8hhuc2ynbrnp3vt954@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Alignment was the last user of the ENABLE_INTS macro, which we can
now remove. All non-syscall exceptions now disable interrupts on
entry, they get re-enabled conditionally from C code. Don't
unconditionally re-enable in program check either, check the
original context.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
We had a case where we could turn on hard interrupts while
leaving the PACA_IRQ_HARD_DIS bit set in the PACA. This can
in turn cause a BUG_ON() to hit in __check_irq_replay() due
to interrupt state getting out of sync.
The assembly code was also way too convoluted. Instead, we
now leave it to the C code to do the right thing which ends
up being smaller and more readable.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The H_REGISTER_VPA hcall implementation in HV Power KVM needs to pin some
guest memory pages into host memory so that they can be safely accessed
from usermode. It does this used get_user_pages_fast(). When the VPA is
unregistered, or the VCPUs are cleaned up, these pages are released using
put_page().
However, the get_user_pages() is invoked on the specific memory are of the
VPA which could lie within hugepages. In case the pinned page is huge,
we explicitly find the head page of the compound page before calling
put_page() on it.
At least with the latest kernel, this is not correct. put_page() already
handles finding the correct head page of a compound, and also deals with
various counts on the individual tail page which are important for
transparent huge pages. We don't support transparent hugepages on Power,
but even so, bypassing this count maintenance can lead (when the VM ends)
to a hugepage being released back to the pool with a non-zero mapcount on
one of the tail pages. This can then lead to a bad_page() when the page
is released from the hugepage pool.
This removes the explicit compound_head() call to correct this bug.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avi Kivity <avi@redhat.com>
The core now has a threadinfo allocator which uses a kmemcache when
THREAD_SIZE < PAGE_SIZE.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Link: http://lkml.kernel.org/r/20120505150142.059161130@linutronix.de
cpuidle uses a generic function now. Remove the cruft.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Link: http://lkml.kernel.org/r/20120507175652.330322737@linutronix.de
commit 771dae818 (powerpc/cpuidle: Add cpu_idle_wait() to allow
switching of idle routines) implemented cpu_idle_wait() for powerpc.
The changelog says:
"The equivalent routine for x86 is in arch/x86/kernel/process.c
but the powerpc implementation is different.":
Unfortunately the changelog is completely useless as it does not tell
_WHY_ it is different.
Aside of being different the implementation is patently wrong.
The rescheduling IPI is async. That means that there is no guarantee,
that the other cores have executed the IPI when cpu_idle_wait()
returns. But that's the whole purpose of this function: to guarantee
that no CPU uses the old idle handler anymore.
Use the smp_functional_call() based implementation, which fulfils the
requirements.
[ This code is going to replaced by a core version to remove all the
pointless copies in arch/*, but this one should go to stable ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Cc: Trinabh Gupta <g.trinabh@gmail.com>
Cc: Arun R Bharadwaj <arun.r.bharadwaj@gmail.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Link: http://lkml.kernel.org/r/20120507175651.980164748@linutronix.de
Cc: stable@vger.kernel.org
Commit 9fb48c744b
"params: add 3rd arg to option handler callback signature"
added an extra arg to the function, but didn't catch all the use
cases needing it, causing this compile fail in mpc85xx_defconfig:
arch/powerpc/mm/hugetlbpage.c:316:4: error: passing argument 7 of
'parse_args' from incompatible pointer type [-Werror]
include/linux/moduleparam.h:317:12: note: expected
'int (*)(char *, char *, const char *)' but argument is of type
'int (*)(char *, char *)'
This function has no need to printk out the "doing" value, so
just add the arg as an "unused".
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jim Cromie <jim.cromie@gmail.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Becky Bruce <beckyb@kernel.crashing.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iQEcBAABAgAGBQJPnb50AAoJEHm+PkMAQRiGAE0H/A4zFZIUGmF3miKPDYmejmrZ
oVDYxVAu6JHjHWhu8E3VsinvyVscowjV8dr15eSaQzmDmRkSHAnUQ+dB7Di7jLC2
MNopxsWjwyZ8zvvr3rFR76kjbWKk/1GYytnf7GPZLbJQzd51om2V/TY/6qkwiDSX
U8Tt7ihSgHAezefqEmWp2X/1pxDCEt+VFyn9vWpkhgdfM1iuzF39MbxSZAgqDQ/9
JJrBHFXhArqJguhENwL7OdDzkYqkdzlGtS0xgeY7qio2CzSXxZXK4svT6FFGA8Za
xlAaIvzslDniv3vR2ZKd6wzUwFHuynX222hNim3QMaYdXm012M+Nn1ufKYGFxI0=
=4d4w
-----END PGP SIGNATURE-----
Merge tag 'v3.4-rc5' into next
Linux 3.4-rc5
Merge to pull in prerequisite change for Smack:
86812bb0de
Requested by Casey.
Pull networking fixes from David Miller:
1) Transfer padding was wrong for full-speed USB in ASIX driver, fix
from Ingo van Lil.
2) Propagate the negative packet offset fix into the PowerPC BPF JIT.
From Jan Seiffert.
3) dl2k driver's private ioctls were letting unprivileged tasks make
MII writes and other ugly bits like that. Fix from Jeff Mahoney.
4) Fix TX VLAN and RX packet drops in ucc_geth, from Joakim Tjernlund.
5) OOPS and network namespace fixes in IPVS from Hans Schillstrom and
Julian Anastasov.
6) Fix races and sleeping in locked context bugs in drop_monitor, from
Neil Horman.
7) Fix link status indication in smsc95xx driver, from Paolo Pisati.
8) Fix bridge netfilter OOPS, from Peter Huang.
9) L2TP sendmsg can return on error conditions with the socket lock
held, oops. Fix from Sasha Levin.
10) udp_diag should return meaningful values for socket memory usage,
from Shan Wei.
11) Eric Dumazet is so awesome he gets his own section:
Socket memory cgroup code (I never should have applied those
patches, grumble...) made erroneous changes to
sk_sockets_allocated_read_positive(). It was changed to
use percpu_counter_sum_positive (which requires BH disabling)
instead of percpu_counter_read_positive (which does not).
Revert back to avoid crashes and lockdep warnings.
Adjust the default tcp_adv_win_scale and tcp_rmem[2] values
to fix throughput regressions. This is necessary as a result
of our more precise skb->truesize tracking.
Fix SKB leak in netem packet scheduler.
12) New device IDs for various bluetooth devices, from Manoj Iyer,
AceLan Kao, and Steven Harms.
13) Fix command completion race in ipw2200, from Stanislav Yakovlev.
14) Fix rtlwifi oops on unload, from Larry Finger.
15) Fix hard_mtu when adjusting hard_header_len in smsc95xx driver.
From Stephane Fillod.
16) ehea driver registers it's IRQ before all the necessary state is
setup, resulting in crashes. Fix from Thadeu Lima de Souza
Cascardo.
17) Fix PHY connection failures in davinci_emac driver, from Anatolij
Gustschin.
18) Missing break; in switch statement in bluetooth's
hci_cmd_complete_evt(). Fix from Szymon Janc.
19) Fix queue programming in iwlwifi, from Johannes Berg.
20) Interrupt throttling defaults not being actually programmed into the
hardware, fix from Jeff Kirsher and Ying Cai.
21) TLAN driver SKB encoding in descriptor busted on 64-bit, fix from
Benjamin Poirier.
22) Fix blind status block RX producer pointer deref in TG3 driver, from
Matt Carlson.
23) Promisc and multicast are busted on ehea, fixes from Thadeu Lima de
Souza Cascardo.
24) Fix crashes in 6lowpan, from Alexander Smirnov.
25) tcp_complete_cwr() needs to be careful to not rewind the CWND to
ssthresh if ssthresh has the "infinite" value. Fix from Yuchung
Cheng.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (81 commits)
sungem: Fix WakeOnLan
tcp: change tcp_adv_win_scale and tcp_rmem[2]
net: l2tp: unlock socket lock before returning from l2tp_ip_sendmsg
drop_monitor: prevent init path from scheduling on the wrong cpu
usbnet: fix failure handling in usbnet_probe
usbnet: fix leak of transfer buffer of dev->interrupt
ucc_geth: Add 16 bytes to max TX frame for VLANs
net: ucc_geth, increase no. of HW RX descriptors
netem: fix possible skb leak
sky2: fix receive length error in mixed non-VLAN/VLAN traffic
sky2: propogate rx hash when packet is copied
net: fix two typos in skbuff.h
cxgb3: Don't call cxgb_vlan_mode until q locks are initialized
ixgbe: fix calling skb_put on nonlinear skb assertion bug
ixgbe: Fix a memory leak in IEEE DCB
igbvf: fix the bug when initializing the igbvf
smsc75xx: enable mac to detect speed/duplex from phy
smsc75xx: declare smsc75xx's MII as GMII capable
smsc75xx: fix phy interrupt acknowledge
smsc75xx: fix phy init reset loop
...
Now the helper function from filter.c for negative offsets is exported,
it can be used it in the jit to handle negative offsets.
First modify the asm load helper functions to handle:
- know positive offsets
- know negative offsets
- any offset
then the compiler can be modified to explicitly use these helper
when appropriate.
This fixes the case of a negative X register and allows to lift
the restriction that bpf programs with negative offsets can't
be jited.
Tested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Jan Seiffert <kaffeemonster@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recently, Ryan Wang tried to compile PPC pSeries platform without
CONFIG_EEH and eventually run into errors. Nishanth Aravamudan
helped to narrow down the root cause. Actually, the pSeries platform
depends on CONFIG_EEH heavily and that won't work properly without
EEH support.
According to Ben's suggestion, the patch make CONFIG_EEH invisible
and keep it as always selected on pSeries platform.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The switch from using irq_map to irq_alloc_desc*() for managing irq
number allocations introduced new bugs in some of the powerpc
interrupt code. Several functions rely on the value of NR_IRQS to
determine the maximum irq number that could get allocated. However,
with sparse_irq and using irq_alloc_desc*() the maximum possible irq
number is now specified with 'nr_irqs' which may be a number larger
than NR_IRQS. This has caused breakage on powermac when
CONFIG_NR_IRQS is set to 32.
This patch removes most of the direct references to NR_IRQS in the
powerpc code and replaces them with either a nr_irqs reference or by
using the common for_each_irq_desc() macro. The powerpc-specific
for_each_irq() macro is removed at the same time.
Also, the Cell axon_msi driver is refactored to remove the global
build assumption on the size of NR_IRQS and instead add a limit to the
maximum irq number when calling irq_domain_add_nomap().
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The mpc8xx driver uses a reference to NR_IRQS that is buggy. It uses
NR_IRQs for the array size of the ppc_cached_irq_mask bitmap, but
NR_IRQs could be smaller than the number of hardware irqs that
ppc_cached_irq_mask tracks.
Also, while fixing that problem, it became apparent that the interrupt
controller only supports 32 interrupt numbers, but it is written as if
it supports multiple register banks which is more complicated.
This patch pulls out the buggy reference to NR_IRQs and fixes the size
of the ppc_cached_irq_mask to match the number of HW irqs. It also
drops the now-unnecessary code since ppc_cached_irq_mask is no longer
an array.
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Pull build fixes for less mainstream architectures from Paul Gortmaker:
"These are fixes for frv(1), blackfin(2), powerpc(1) and xtensa(4).
Fortunately the touches are nearly all specific to files just used by
the arch in question. The two touches to shared/common files
[kernel/irq/debug.h and drivers/pci/Makefile] are trivial to assess as
no risk to anyone.
Half of them relate to xtensa directly. It was only when I fixed the
last xtensa issue that I realized that the arch has been broken for a
significant time, and isn't a specific v3.4 regression. So if you
wanted, we could leave xtensa lying bleeding in the street for a
couple more weeks and queue those for 3.5. But given they are no risk
to anyone outside of xtensa, I figured to just leave them in.
If you are OK with taking the xtensa fixes, then please pull to get:
- one last implicit include uncovered by system.h that is in a file
specific to just one powerpc defconfig. (I'd sync'd with BenH).
- fix an oversight in the PCI makefile where shared code wasn't being
compiled for ARCH=frv
- fix a missing include for GPIO in blackfin framebuffer.
- audit and tag endif in blackfin ezkit board file, in order to find
and fix the misplaced endif masking a block of code.
- fix irq/debug.h choice of temporary macro names to be more internal
so they don't conflict with names used by xtensa.
- fix a reference to an undeclared local var in xtensa's signal.c
- fix an implicit bug.h usage in xtensa's asm/io.h uncovered by my
removing bug.h from kernel.h
- fix xtensa to properly indicate it is using asm-generic/hardirq.h
in order to resolve the link error - undefined ack_bad_irq
The xtensa still fails final link as my latest binutils does something
evil when ld forward-relocates unlikely() blocks, but in theory people
who have older/valid toolchains could now use the thing."
* 'for-v3.4-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux:
xtensa: fix build fail on undefined ack_bad_irq
blackfin: fix ifdef fustercluck in mach-bf538/boards/ezkit.c
blackfin: fix compile error in bfin-lq035q1-fb.c
pci: frv architecture needs generic setup-bus infrastructure
irq: hide debug macros so they don't collide with others.
xtensa: fix build error in xtensa/include/asm/io.h
xtensa: fix build failure in xtensa/kernel/signal.c
powerpc: fix system.h fallout in sysdev/scom.c [chroma_defconfig]
Renaming remaining PERF_COUNTERS options into PERF_EVENTS.
Think we can get rid of PERF_COUNTERS now.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1333643084-26776-5-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Link: http://lkml.kernel.org/r/20120420124557.311212868@linutronix.de
Preparatory patch to make the idle thread allocation for secondary
cpus generic.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: x86@kernel.org
Link: http://lkml.kernel.org/r/20120420124556.964170564@linutronix.de
This gets rid of the unused default senses array, and replaces the
incorrect use of IRQ_TYPE_NONE with the new IRQ_TYPE_DEFAULT for
the initial set_trigger() call when mapping an interrupt.
This in turn makes us read the HW state and update the irq desc
accordingly.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
mpic_is_ipi() takes a virq and immediately converts it to a hw_irq.
However, one of the two call sites calls it with a ... hw_irq. The
other call site also happens to have the hw_irq at hand, so let's
change it to just take that as an argument. Also change mpic_is_tm()
for consistency.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
If the interrupt and the timeout happen roughly at the same
time, we can get into a situation where the timer function
is run while the interrupt has already been processed. In
this case, the timer function might end up doing an add_timer
on an already pending timer, causing a BUG_ON() to trigger.
Instead, just skip the whole timeout operation if we see that
the timer is pending. The spinlock ensures that the only way
that happens is if we already started a new operation and thus
the timeout can be ignored.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The problem was reported by Anton Blanchard. While EEH error
happened to the PCI device without the corresponding device
driver, kernel crash was seen. Eventually, I successfully
reproduced the problem on Firebird-L machine with utility
"errinjct". Initially, the device driver for Emulex ethernet
MAC has been disabled from .config and force data parity on
the Emulex ethernet MAC with help of "errinjct". Eventually,
I saw the kernel crash after issueing couple of "lspci -v"
command.
The root cause behind is that the PCI device, including the
reference to the corresponding eeh device, will be removed
from the system while EEH does recovery. Afterwards, the
PCI device will be probed again and added into the system
accordingly. So it's not safe to retrieve the eeh device from
the corresponding PCI device after the PCI device has been removed
and not added again.
The patch fixes the issue and retrieve the eeh device from OF node
instead of PCI device after the PCI device has been removed.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Tested-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Also fix issue of accessing invalid msgr pointer issue. The local
msgr pointer in fucntion mpic_msgr_get will be accessed before
getting a valid address which will cause kernel crash.
Signed-off-by: Mingkai Hu <Mingkai.hu@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
In file included from arch/powerpc/sysdev/mpic_msgr.c:20:0:
~/arch/powerpc/include/asm/mpic_msgr.h: In function 'mpic_msgr_set_destination':
~/arch/powerpc/include/asm/mpic_msgr.h:117:2:
error: implicit declaration of function 'get_hard_smp_processor_id'
make[1]: *** [arch/powerpc/sysdev/mpic_msgr.o] Error 1
Signed-off-by: Mingkai Hu <Mingkai.hu@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Commit ae3a197e (Disintegrate asm/system.h for PowerPC) broke build of
assembly files when CONFIG_BOOKE_WDT is enabled as follows:
AS arch/powerpc/lib/string.o
/home/baruch/git/stable/arch/powerpc/include/asm/reg_booke.h: Assembler messages:
/home/baruch/git/stable/arch/powerpc/include/asm/reg_booke.h:19: Error: Unrecognized opcode: `extern'
/home/baruch/git/stable/arch/powerpc/include/asm/reg_booke.h:20: Error: Unrecognized opcode: `extern'
Since setup_32.c is the only user of the booke_wdt configuration variables, move
the declarations there.
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
Commit 46d026ac ("powerpc/85xx: consolidate of_platform_bus_probe calls")
replaced platform-specific of_device_id tables with a single function
that probes the most of the busses in 85xx device trees. If a specific
platform needed additional busses probed, then it could call
of_platform_bus_probe() again. Typically, the additional platform-specific
busses are children of existing busses that have already been probed.
of_platform_bus_probe() does not handle those child busses automatically.
Unfortunately, this doesn't actually work. The second (platform-specific)
call to of_platform_bus_probe() never finds any of the busses it's asked
to find.
To remedy this, the platform-specific of_device_id tables are eliminated,
and their entries are merged into mpc85xx_common_ids[], so that all busses
are probed at once.
Signed-off-by: Timur Tabi <timur@freescale.com>
Signed-off-by: Kumar Gala <galak@kernel.crashing.org>
The following shows up in chroma_defconfig:
CC arch/powerpc/sysdev/scom.o
arch/powerpc/sysdev/scom.c: In function 'scom_debug_init':
arch/powerpc/sysdev/scom.c:182:36: error: 'powerpc_debugfs_root' undeclared (first use in this function)
arch/powerpc/sysdev/scom.c:182:36: note: each undeclared identifier is reported only once for each function it appears in
make[2]: *** [arch/powerpc/sysdev/scom.o] Error 1
make[1]: *** [arch/powerpc/sysdev/scom.o] Error 2
A bisect leads to commit 9ffc93f203
"Remove all #inclusions of asm/system.h"
Add the debug header which contains powerpc_debugfs_root.
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
This change is inspired by
https://lkml.org/lkml/2012/4/16/14
which fixes the build warnings for arches that don't support
CONFIG_HAVE_ARCH_SECCOMP_FILTER.
In particular, there is no requirement for the return value of
secure_computing() to be checked unless the architecture supports
seccomp filter. Instead of silencing the warnings with (void)
a new static inline is added to encode the expected behavior
in a compiler and human friendly way.
v2: - cleans things up with a static inline
- removes sfr's signed-off-by since it is a different approach
v1: - matches sfr's original change
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Will Drewry <wad@chromium.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>