This patch implements code to handle ibs interrupts. If ibs data
is available a raw perf_event data sample is created and sent
back to the userland. This patch only implements the storage of
ibs data in the raw sample, but this could be extended in a
later patch by generating generic event data such as the rip
from the ibs sampling data.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1323968199-9326-3-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
This patch implements perf configuration for AMD IBS. The IBS
pmu is selected using the type attribute in sysfs. There are two
types of ibs pmus, for instruction fetch (IBS_FETCH) and for
instruction execution (IBS_OP):
/sys/bus/event_source/devices/ibs_fetch/type
/sys/bus/event_source/devices/ibs_op/type
Except for the sample period IBS can only be set up with raw
config values and raw data samples. The event attributes for the
syscall should be programmed like this (IBS_FETCH):
type = get_pmu_type("/sys/bus/event_source/devices/ibs_fetch/type");
memset(&attr, 0, sizeof(attr));
attr.type = type;
attr.sample_type = PERF_SAMPLE_CPU | PERF_SAMPLE_RAW;
attr.config = IBS_FETCH_CONFIG_DEFAULT;
This implementation does not yet support 64 bit counters. It is
limited to the hardware counter bit width which is 20 bits. 64
bit support can be added later.
Signed-off-by: Robert Richter <robert.richter@amd.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1323968199-9326-2-git-send-email-robert.richter@amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Fix a bug in kprobes which can modify kernel code
permanently at run-time. In the result, kernel can
crash when it executes the modified code.
This bug can happen when we put two probes enough near
and the first probe is optimized. When the second probe
is set up, it copies a byte which is already modified
by the first probe, and executes it when the probe is hit.
Even worse, the first probe and the second probe are removed
respectively, the second probe writes back the copied
(modified) instruction.
To fix this bug, kprobes always recovers the original
code and copies the first byte from recovered instruction.
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: yrl.pp-manager.tt@hitachi.com
Cc: systemtap@sourceware.org
Cc: anderson@redhat.com
Link: http://lkml.kernel.org/r/20120305133215.5982.31991.stgit@localhost.localdomain
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Current probed-instruction recovery expects that only breakpoint
instruction modifies instruction. However, since kprobes jump
optimization can replace original instructions with a jump,
that expectation is not enough. And it may cause instruction
decoding failure on the function where an optimized probe
already exists.
This bug can reproduce easily as below:
1) find a target function address (any kprobe-able function is OK)
$ grep __secure_computing /proc/kallsyms
ffffffff810c19d0 T __secure_computing
2) decode the function
$ objdump -d vmlinux --start-address=0xffffffff810c19d0 --stop-address=0xffffffff810c19eb
vmlinux: file format elf64-x86-64
Disassembly of section .text:
ffffffff810c19d0 <__secure_computing>:
ffffffff810c19d0: 55 push %rbp
ffffffff810c19d1: 48 89 e5 mov %rsp,%rbp
ffffffff810c19d4: e8 67 8f 72 00 callq
ffffffff817ea940 <mcount>
ffffffff810c19d9: 65 48 8b 04 25 40 b8 mov %gs:0xb840,%rax
ffffffff810c19e0: 00 00
ffffffff810c19e2: 83 b8 88 05 00 00 01 cmpl $0x1,0x588(%rax)
ffffffff810c19e9: 74 05 je ffffffff810c19f0 <__secure_computing+0x20>
3) put a kprobe-event at an optimize-able place, where no
call/jump places within the 5 bytes.
$ su -
# cd /sys/kernel/debug/tracing
# echo p __secure_computing+0x9 > kprobe_events
4) enable it and check it is optimized.
# echo 1 > events/kprobes/p___secure_computing_9/enable
# cat ../kprobes/list
ffffffff810c19d9 k __secure_computing+0x9 [OPTIMIZED]
5) put another kprobe on an instruction after previous probe in
the same function.
# echo p __secure_computing+0x12 >> kprobe_events
bash: echo: write error: Invalid argument
# dmesg | tail -n 1
[ 1666.500016] Probing address(0xffffffff810c19e2) is not an instruction boundary.
6) however, if the kprobes optimization is disabled, it works.
# echo 0 > /proc/sys/debug/kprobes-optimization
# cat ../kprobes/list
ffffffff810c19d9 k __secure_computing+0x9
# echo p __secure_computing+0x12 >> kprobe_events
(no error)
This is because kprobes doesn't recover the instruction
which is overwritten with a relative jump by another kprobe
when finding instruction boundary.
It only recovers the breakpoint instruction.
This patch fixes kprobes to recover such instructions.
With this fix:
# echo p __secure_computing+0x9 > kprobe_events
# echo 1 > events/kprobes/p___secure_computing_9/enable
# cat ../kprobes/list
ffffffff810c1aa9 k __secure_computing+0x9 [OPTIMIZED]
# echo p __secure_computing+0x12 >> kprobe_events
# cat ../kprobes/list
ffffffff810c1aa9 k __secure_computing+0x9 [OPTIMIZED]
ffffffff810c1ab2 k __secure_computing+0x12 [DISABLED]
Changes in v4:
- Fix a bug to ensure optimized probe is really optimized
by jump.
- Remove kprobe_optready() dependency.
- Cleanup code for preparing optprobe separation.
Changes in v3:
- Fix a build error when CONFIG_OPTPROBE=n. (Thanks, Ingo!)
To fix the error, split optprobe instruction recovering
path from kprobes path.
- Cleanup comments/styles.
Changes in v2:
- Fix a bug to recover original instruction address in
RIP-relative instruction fixup.
- Moved on tip/master.
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: yrl.pp-manager.tt@hitachi.com
Cc: systemtap@sourceware.org
Cc: anderson@redhat.com
Link: http://lkml.kernel.org/r/20120305133209.5982.36568.stgit@localhost.localdomain
Signed-off-by: Ingo Molnar <mingo@elte.hu>
On tui annotation, the title was set to name of the target symbol if
user selects the target. However it remained after returning to original
symbol from the target. Fix it.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1329986784-4916-4-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Accepting upper case character only is unconvenient since it requires
SHIFT key too. Why not change to it accept a simple key stroke?
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1329986784-4916-3-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Print unselected asm code lines as blue. This is what we do now for
--stdio.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1329986784-4916-2-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There are some variable arguments can be specified on make invocation,
but some of them are missing descriptions so that user cannot be
informed easily. Fix it.
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1329980894-4289-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If perf_evsel__open() failed, the errno was set and returned properly.
However since the perf_evlist__open() called close() on fd's for all of
evsel x cpu x thread after the failure, the errno was overridden by
other code (EBADF). So the caller of the function ended up seeing
different error message and getting confused.
Fit it by restoring original return value. Because one of caller of the
function is in the python extension, and it uses system errno
internally, it'd be better restoring the original value rather than
using the return value of the function directly, IMHO (i.e. I'm not a
python expert :)
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1329966816-23175-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
the sampling tools (top, record) are back working on AMD systems.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
iQIcBAABAgAGBQJPUmaEAAoJENZQFvNTUqpA12IP/1JyMC7jtS6PUTKYOnTfwe51
hTdFBUCAquj2cFJNWOe9AidBe1eOUFMgdNBW3hEhvZA3arOJVO+eAyUb72y24yQm
mbn1GMIYEj3MR+nESo7HqQ4J9byE1ngNp8IYZk3wmTit+6k8XaF2dN4o+3ZPdY6S
zkOT4bjzG8C4Y614S/qXdulS42EJUG/oyhB0Fhom2LEFt+WCCB/b0Z2XIZuh2dEt
E1oEzy5wcQZMhIFpl4LWrS/zPJ8+NBAJdBmuNvHoF5IFZxIqcK4/vUWxMVVzMc3f
xVO6rzzcpENHF/qVXztDzeIfVcnMIKBY2cEzsZI17e+gH+Mdda0ByQXwbbG4ccxF
DIxy6gwQkUpTX8tvxv1E2LTFl7qAEQ+2Ryv3yY54jjJ/UwayesEFZk0yPnnOauV6
3wHKqVmyz0We+0zsh5C7UWUiA2DI+dR1xxONd7c1FoXNRpFxCwd1DGBhG5sWTjOi
9Loh0Zq2SyD1r5f6nDbdKw1e9LaK25DLZrlkQUcGyB+sZcyQL4oYDAZK9dJHdUNj
fzaQb7e+3fFZP0hcVhUy17tTWR67pg6BGljrerVD75AKZjPpsGCj4t36yv80y+fK
yIgLyTcbGHFubqc2EvxJ9BMhSKjla/s5sqvsrQv/5OYPNrtBVaHBU5Xg+xbF+ni0
hG0MiE02CAxCau6Yxh56
=sbHZ
-----END PGP SIGNATURE-----
Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent
Cherry picked fixes from perf/core, together with the kernel fix (1018faa),
the sampling tools (top, record) are back working on AMD systems.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Just fall back to resetting those fields, if set, warning the user that
that feature is not available.
If guest samples appear they will just be discarded because no struct
machine will be found and thus the event will be accounted as not
handled and dropped, see 0c09571.
Reported-by: Namhyung Kim <namhyung@gmail.com>
Tested-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-vuwxig36mzprl5n7nzvnxxsh@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Setting perf_guest to true by default makes no sense because the perf
subcommands can not setup guest symbol information and thus not process
and guest samples. The only exception is perf-kvm which changes the
perf_guest value on its own. So change the default for perf_guest back
to false.
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1328893505-4115-3-git-send-email-joerg.roedel@amd.com
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A recent refactoring of perf-record introduced the following:
perf record -a -B
Couldn't generating buildids. Use --no-buildid to profile anyway.
sleep: Terminated
I believe the triple negative was meant to be only a double negative.
:-) While I'm there, fixed the grammar on the error message.
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1328567272-13190-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It turned out that a performance counter on AMD does not
count at all when the GO or HO bit is set in the control
register and SVM is disabled in EFER.
This patch works around this issue by masking out the HO bit
in the performance counter control register when SVM is not
enabled.
The GO bit is not touched because it is only set when the
user wants to count in guest-mode only. So when SVM is
disabled the counter should not run at all and the
not-counting is the intended behaviour.
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Avi Kivity <avi@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: stable@vger.kernel.org # v3.2
Link: http://lkml.kernel.org/r/1330523852-19566-1-git-send-email-joerg.roedel@amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
The 'perf probe' command allows kprobe to be inserted at any offset from
a function start, which results in adding kprobes to unintended
location. (example: perf probe do_fork+10000 is allowed even though
size of do_fork is ~904).
My previous patch https://lkml.org/lkml/2012/2/24/42 addressed the case
where DWARF info was available for the kernel. This patch fixes the
case where perf probe is used on a kernel without debuginfo available.
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/4F4C544D.1010909@linux.vnet.ibm.com
Signed-off-by: Prashanth Nageshappa <prashanth@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If threads in a multi-threaded process have names shorter than the main
thread the comm for the named threads is not properly terminated.
E.g., for the process 'namedthreads' where each thread is named noploop%d
where %d is the thread number:
Before:
perf script -f comm,tid,ip,sym,dso
noploop:4ads 21616 400a49 noploop (/tmp/namedthreads)
The 'ads' in the thread comm bleeds over from the process name.
After:
perf script -f comm,tid,ip,sym,dso
noploop:4 21616 400a49 noploop (/tmp/namedthreads)
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1330111898-68071-1-git-send-email-dsahern@gmail.com
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf probe command allows kprobe to be inserted at any offset from a
function start, which results in adding kprobes to unintended location.
Example: perf probe do_fork+10000 is allowed even though size of do_fork
is ~904.
This patch will ensure probe addition fails when the offset specified is
greater than size of the function.
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jason Baron <jbaron@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Link: http://lkml.kernel.org/r/4F473F33.4060409@linux.vnet.ibm.com
Signed-off-by: Prashanth Nageshappa <prashanth@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On old kernels that don't support sample_id_all feature,
perf_evlist__id2evsel() returns NULL for non-sampling events.
This breaks perf top when multiple events are given on command line. Fix
it by using first evsel in the evlist. This will also prevent getting
the same (potential) problem in such new tool/ old kernel combo.
Suggested-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1329702447-25045-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In the jump label enabled case, calling static_key_enabled()
results in a function call. The function returns the results of
a compare, so it really doesn't need the overhead of a full
function call. Let's make it 'static inline' for both the jump
label enabled and disabled cases.
Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: a.p.zijlstra@chello.nl
Cc: rostedt@goodmis.org
Cc: mathieu.desnoyers@polymtl.ca
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/201202281849.q1SIn1p2023270@int-mx10.intmail.prod.int.phx2.redhat.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
If kzalloc() for TYPE_DATA failed on a given cpu, previous chunk
of TYPE_INST will be leaked. Fix it.
Thanks to Peter Zijlstra for suggesting this better solution. It
should work as long as the initial value of the region is all
0's and that's the case of static (per-cpu) memory allocation.
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Link: http://lkml.kernel.org/r/1330391978-28070-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Here are some trivial NTFS changes (a spelling fix and two use before
NULL check cases found by Coverity as well as an update in MAINTAINERS
for the path to the ntfs git repo) together with a simple LDM fix for
parsing fragmented VBLKs.
* git://git.kernel.org/pub/scm/linux/kernel/git/aia21/ntfs:
NTFS: Update git repo path in MAINTAINERS file.
LDM: Fix reassembly of extended VBLKs.
NTFS: Correct two spelling errors "dealocate" to "deallocate" in mft.c.
NTFS: Do not dereference pointer before checking for NULL.
NTFS: Remove unused variable.
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mce/AMD: Fix UP build error
x86: Specify a size for the cmp in the NMI handler
x86/nmi: Test saved %cs in NMI to determine nested NMI case
x86/amd: Fix L1i and L2 cache sharing information for AMD family 15h processors
x86/microcode: Remove noisy AMD microcode warning
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
genirq: Handle pending irqs in irq_startup()
genirq: Unmask oneshot irqs when thread was not woken
The new is_compat_task() define for the !COMPAT case in
include/linux/compat.h conflicts with a similar define in
arch/s390/include/asm/compat.h.
This is the minimal patch which fixes the build issues.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
converted back to WB but end up being recycled in the general memory
pool as WC.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQEcBAABAgAGBQJPStrRAAoJEFjIrFwIi8fJovAH/RBUJdeDw8x5ki2yDhAz/80S
+yZKiGaaUYYCB0Fo/BIwVhBQeDabGz8rJCdOv40tRpRCiRD7JIfMo5tCS6QIFF7P
UvhVuJcqltxIoRjz7nGX8iSUl48JKy9vqmqWXIucG3rYQ7YOkadwVTbhsg4a9U6P
fcqexzUuXb4fr6CNBBpL3LqHfDaKNovgESHlAmzrcaRGbOADp9LVlWkR6kwiTnIA
e5yU/DEW9Ej6wJM90Mx9Rg3y22hBZEL1p5NJjaiMrOY2LzX7bE4+mTgtk+a4FNGD
8WJZm/WWhdsWrKlj8vCKOuJkIgQYJURVMySEGdzM91P1FpJ3edJxIM3qlA958vc=
=jggO
-----END PGP SIGNATURE-----
Merge tag 'stable/for-linus-fixes-3.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Two fixes to fix a memory corruption bug when WC pages never get
converted back to WB but end up being recycled in the general memory
pool as WC.
There is a better way of fixing this, but there is not enough time to do
the full benchmarking to pick one of the right options - so picking the
one that favors stability for right now.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* tag 'stable/for-linus-fixes-3.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/pat: Disable PAT support for now.
xen/setup: Remove redundant filtering of PTE masks.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJPSsdkAAoJENkgDmzRrbjxHQUP/RxiMxjwoX4Q/Fzrapg1fkag
0ZhZ2N9w/6AwIGF/IAqiX4BRYJueI3+LJp4W7kwKQTBTn9L1DEvw1i6htBhci25P
uW6ki1fJZh9ewT4FbMAkA7uapqloijfQoFEt0dsWlSMgzh+RqfVNDIhILsKaYCey
uFnRWrPPyB1vW0ofiB7j4LrEuRnEsitsg42Zqdpq4mX4t/cPfP4uZvvB0za6IxLV
MhPSyJ2cuuEtdIs/6DhrftDImu0k7Tbd3Jvsf7DmkF0wZOV5mCWRuroRq8kJWvzP
oJhv4DhHL8uLkqj3ljqkrKaSqQ+e1OVE0dq8bbA/3WRK8L1V94C53bTYcGuYnByN
reIXx5l4GeqDTi+SBMug3bXtrakv3pq92cBVmc4ERwtBPVObGU6Tljgdke05MF8j
i7Kg1bUYX3v3zyj5mVoCflpc5OXUcwA4CFn5mV6zHNNFvEbWQaQLnDTXDGihgjgd
Xl8W58V4yOqbXWpjnT09g934Vt1GuFrhraulbeo128bPUyg+CpMCh5cMdJIfSFxg
Rti7xLeXHm6D+7B5VpvA574k6Ah9HtCkbLSy/W0Wxa/3ckPHW2QYHWM3Sq1kBz2B
TH6sXLgJor1P8M64tiZsoF/VMvdWmBth3ZwOU1ggVZvtMGkqDkxqkkQxw7Cql579
o1NSSN/cyWlrcymWY02h
=Rmyt
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://github.com/rustyrussell/linux
* tag 'for-linus' of git://github.com/rustyrussell/linux:
mod/file2alias: make modpost compile on darwin again
commit e49ce14150 breaks cross compiling
the linux kernel on darwin hosts.
This fix introduce some minimal glue to adopt linker section handling
for darwin hosts.
Signed-off-by: Andreas Bießmann <andreas@biessmann.de>
CC: Rusty Russell <rusty@rustcorp.com.au>
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CC: Jochen Friedrich <jochen@scram.de>
CC: Samuel Ortiz <sameo@linux.intel.com>
CC: "K. Y. Srinivasan" <kys@microsoft.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Tested-by: Bernhard Walle <bernhard@bwalle.de>
1) ICMP sockets leave err uninitialized but we try to return it for the
unsupported MSG_OOB case, reported by Dave Jones.
2) Add new Zaurus device ID entries, from Dave Jones.
3) Pointer calculation in hso driver memset is wrong, from Dan
Carpenter.
4) ks8851_probe() checks unsigned value as negative, fix also from Dan
Carpenter.
5) Fix crashes in atl1c driver due to TX queue handling, from Eric
Dumazet. I anticipate some TX side locking fixes coming in the near
future for this driver as well.
6) The inline directive fix in Bluetooth which was breaking the build
only with very new versions of GCC, from Johan Hedberg.
7) Fix crashes in the ATP CLIP code due to ARP cleanups this merge
window, reported by Meelis Roos and fixed by Eric Dumazet.
8) JME driver doesn't flush RX FIFO correctly, from Guo-Fu Tseng.
9) Some ip6_route_output() callers test the return value for NULL, but
this never happens as the convention is to return a dst entry with
dst->error set. Fixes from RonQing Li.
10) Logitech Harmony 900 should be handled by zaurus driver not
cdc_ether, update white lists and black lists accordingly. From
Scott Talbert.
11) Receiving from certain kinds of devices there won't be a MAC header,
so there is no MAC header to fixup in the IPSEC code, and if we try
to do it we'll crash. Fix from Eric Dumazet.
12) Port type array indexing off-by-one in mlx4 driver, fix from Yevgeny
Petrilin.
13) Fix regression in link-down handling in davinci_emac which causes
all RX descriptors to be freed up and therefore RX to wedge
completely, from Christian Riesch.
14) It took two attempts, but ctnetlink soft lockups seem to be
cured now, from Pablo Neira Ayuso.
15) Endianness bug fix in ENIC driver, from Santosh Nayak.
16) The long ago conversion of the PPP fragmentation code over to
abstracted SKB list handling wasn't perfect, once we get an
out of sequence SKB we don't flush the rest of them like we
should. From Ben McKeegan.
17) Fix regression of ->ip_summed initialization in sfc driver.
From Ben Hutchings.
18) Bluetooth timeout mistakenly using msecs instead of jiffies,
from Andrzej Kaczmarek.
19) Using _sync variant of work cancellation results in deadlocks,
use the non _sync variants instead. From Andre Guedes.
20) Bluetooth rfcomm code had reference counting problems leading
to crashes, fix from Octavian Purdila.
21) The conversion of netem over to classful qdisc handling added
two bugs to netem_dequeue(), fixes from Eric Dumazet.
22) Missing pci_iounmap() in ATM Solos driver. Fix from Julia Lawall.
23) b44_pci_exit() should not have __exit tag since it's invoked from
non-__exit code. From Nikola Pajkovsky.
24) The conversion of the neighbour hash tables over to RCU added a
race, fixed here by adding the necessary reread of tbl->nht, fix
from Michel Machado.
25) When we added VF (virtual function) attributes for network device
dumps, this potentially bloats up the size of the dump of one
network device such that the dump size is too large for the buffer
allocated by properly written netlink applications.
In particular, if you add 255 VFs to a network device, parts of
GLIBC stop working.
To fix this, we add an attribute that is used to turn on these
extended portions of the network device dump. Sophisticaed
applications like 'ip' that want to see this stuff will be changed
to set the attribute, whereas things like GLIBC that don't care
about VFs simply will not, and therefore won't be busted by the
mere presence of VFs on a network device.
Thanks to the tireless work of Greg Rose on this fix.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (53 commits)
sfc: Fix assignment of ip_summed for pre-allocated skbs
ppp: fix 'ppp_mp_reconstruct bad seq' errors
enic: Fix endianness bug.
gre: fix spelling in comments
netfilter: ctnetlink: fix soft lockup when netlink adds new entries (v2)
Revert "netfilter: ctnetlink: fix soft lockup when netlink adds new entries"
davinci_emac: Do not free all rx dma descriptors during init
mlx4_core: Fixing array indexes when setting port types
phy: IC+101G and PHY_HAS_INTERRUPT flag
netdev/phy/icplus: Correct broken phy_init code
ipsec: be careful of non existing mac headers
Move Logitech Harmony 900 from cdc_ether to zaurus
hso: memsetting wrong data in hso_get_count()
netfilter: ip6_route_output() never returns NULL.
ethernet/broadcom: ip6_route_output() never returns NULL.
ipv6: ip6_route_output() never returns NULL.
jme: Fix FIFO flush issue
atm: clip: remove clip_tbl
ipv4: ping: Fix recvmsg MSG_OOB error handling.
rtnetlink: Fix problem with buffer allocation
...
The autofs compat handling fix caused a compile failure when
CONFIG_COMPAT isn't defined.
Instead of adding random #ifdef'fery in autofs, let's just make the
compat helpers earlier to use: without CONFIG_COMPAT, is_compat_task()
just hardcodes to zero.
We could probably do something similar for a number of other cases where
we have #ifdef's in code, but this is the low-hanging fruit.
Reported-and-tested-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
iQIcBAABAgAGBQJPR9QMAAoJEMsfJm/On5mBWbgP+wcJFIpqh84EyFGQuhSt61su
MQYcF8fWexSBzYjK5TB0nOougY/DQfTIBrqGf00Yr0RM81W3DlfbFNfl9tlBSwee
SEMBIm8qXALjWGb0eL3gy+Y8U10I2Pe0AvAvbiw3rjuJadqCLKljIQjQgwsuY0zY
yHwmeYgLTHVQfH8pk1n2QCFviqC4Lfd4Ltf/9EdQjm3WMxbJdLfHFnb99JvUt9Cu
b/NlXjeECZFLmX4P7JltdcfixbDUls9bPioFoldR8g5k1t/nCpnuUqPyj75zjQFk
D3WwzdA+YzH2nvoj0c+RjFBXYoHG7M/DFkKAy3ZF2cuRvxWP8QDK+WBbbOfA3HBF
YeefJp7hThRZVTkLCW+rG93PcUAIbOyX2tEk7DuaVlM9MyMjE/2SMqVw+AO6PBDS
4qXgfBrFmpVB6rapChD2M8RJsjbV5rw6eI3jM3+e2MJWDrCKgi13KHaCsUWTRFXI
l429vy809sdAx+8dTJK4KOIytFXDrrsw6tMPs9JQvcbLr2Bpn/4/jyUSyrZ0a7J4
ovfQtFk7y9NPHgjQmOswZm2yJPLwEJbuMUcwK3iIpelX8Y51ltMuVaQymKTeJAwi
+elRGK24q8C8wmRKT3mul4qSTmdvC+8nsAMD4/ozKsgDyO4oc9SYhf46mjCXzyn2
km0owCCnyoFWzS+w9lUJ
=YJz+
-----END PGP SIGNATURE-----
Merge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Couple of minor driver fixes.
* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (max34440) Fix resetting temperature history
hwmon: (f75375s) Fix register write order when setting fans to full speed
hwmon: (ads1015) Fix file leak in probe function
hwmon: (max6639) Fix PPR register initialization to set both channels
hwmon: (max6639) Fix FAN_FROM_REG calculation
three kbuild fixes for 3.3:
- make deb-pkg symlink race fix.
- make coccicheck fix.
- Dropping the check for modutils. This is not a regression, but
allows the module-init-tools replacement kmod work with the 3.3
kernel.
* 'rc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
coccicheck: change handling of C={1,2} when M= is set
builddeb: Don't create files in /tmp with predictable names
kbuild: do not check for ancient modutils tools
When the autofs protocol version 5 packet type was added in commit
5c0a32fc2c ("autofs4: add new packet type for v5 communications"), it
obvously tried quite hard to be word-size agnostic, and uses explicitly
sized fields that are all correctly aligned.
However, with the final "char name[NAME_MAX+1]" array at the end, the
actual size of the structure ends up being not very well defined:
because the struct isn't marked 'packed', doing a "sizeof()" on it will
align the size of the struct up to the biggest alignment of the members
it has.
And despite all the members being the same, the alignment of them is
different: a "__u64" has 4-byte alignment on x86-32, but native 8-byte
alignment on x86-64. And while 'NAME_MAX+1' ends up being a nice round
number (256), the name[] array starts out a 4-byte aligned.
End result: the "packed" size of the structure is 300 bytes: 4-byte, but
not 8-byte aligned.
As a result, despite all the fields being in the same place on all
architectures, sizeof() will round up that size to 304 bytes on
architectures that have 8-byte alignment for u64.
Note that this is *not* a problem for 32-bit compat mode on POWER, since
there __u64 is 8-byte aligned even in 32-bit mode. But on x86, 32-bit
and 64-bit alignment is different for 64-bit entities, and as a result
the structure that has exactly the same layout has different sizes.
So on x86-64, but no other architecture, we will just subtract 4 from
the size of the structure when running in a compat task. That way we
will write the properly sized packet that user mode expects.
Not pretty. Sadly, this very subtle, and unnecessary, size difference
has been encoded in user space that wants to read packets of *exactly*
the right size, and will refuse to touch anything else.
Reported-and-tested-by: Thomas Meyer <thomas@m3y3r.de>
Signed-off-by: Ian Kent <raven@themaw.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When pre-allocating skbs for received packets, we set ip_summed =
CHECKSUM_UNNCESSARY. We used to change it back to CHECKSUM_NONE when
the received packet had an incorrect checksum or unhandled protocol.
Commit bc8acf2c8c ('drivers/net: avoid
some skb->ip_summed initializations') mistakenly replaced the latter
assignment with a DEBUG-only assertion that ip_summed ==
CHECKSUM_NONE. This assertion is always false, but it seems no-one
has exercised this code path in a DEBUG build.
Fix this by moving our assignment of CHECKSUM_UNNECESSARY into
efx_rx_packet_gro().
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
iQEcBAABAgAGBQJPSByoAAoJEDeqqVYsXL0M3CAIAJT740lwOyc9hyIBarZiZcZj
Ib9rPPppThKYvbo8w6Q6xITNTocohSnmPnjbfXZzN4nLrp1Xbzi6A4YLSeY3oxwE
t11LOMnXYPgCOCNZA3iJ4WadVbfs4Id6PWWZPnifWl6rZ2mhvtWmkCNzayY0Kv2t
WuX0j8ds0KgDG6xpfXKoXvHeNuEDJ5aZF/gtI1kmo1eilwPjlovCjsEWetHr/FQA
0jIKFdgf/nZ1ENZU0ztqGd/Q3er6t7G9qS7cFxUa4fWsqG+8Kl+KIk2PHDLL1QHu
tlYtaGm5kbh5d2tfzAD4HJZqJRw2LQ6U1gqofoAKS7JbqYPUJZDoERQtWZUpoj4=
=GZJg
-----END PGP SIGNATURE-----
Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6
SCSI fixes on 20120224:
"This is a set of assorted bug fixes for power management, mpt2sas,
ipr, the rdac device handler and quite a big chunk for qla2xxx (plus a
use after free of scsi_host in scsi_scan.c). "
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-rc-fixes-2.6:
[SCSI] scsi_dh_rdac: Fix for unbalanced reference count
[SCSI] scsi_pm: Fix bug in the SCSI power management handler
[SCSI] scsi_scan: Fix 'Poison overwritten' warning caused by using freed 'shost'
[SCSI] qla2xxx: Update version number to 8.03.07.13-k.
[SCSI] qla2xxx: Proper detection of firmware abort error code for ISP82xx.
[SCSI] qla2xxx: Remove resetting memory during device initialization for ISP82xx.
[SCSI] qla2xxx: Complete mailbox command timedout to avoid initialization failures during next reset cycle.
[SCSI] qla2xxx: Remove check for null fcport from host reset handler.
[SCSI] qla2xxx: Correct out of bounds read of ISP2200 mailbox registers.
[SCSI] qla2xxx: Remove errant clearing of MBX_INTERRUPT flag during CT-IOCB processing.
[SCSI] qla2xxx: Clear options-flags while issuing stop-firmware mbx command.
[SCSI] qla2xxx: Add an "is reset active" helper.
[SCSI] qla2xxx: Add check for null fcport references in qla2xxx_queuecommand.
[SCSI] qla2xxx: Propagate up abort failures.
[SCSI] isci: Fix NULL ptr dereference when no firmware is being loaded
[SCSI] ipr: fix eeh recovery for 64-bit adapters
[SCSI] mpt2sas: Fix mismatch in mpt2sas_base_hard_reset_handler() mutex lock-unlock
This patch fixes a (mostly cosmetic) bug introduced by the patch
'ppp: Use SKB queue abstraction interfaces in fragment processing'
found here: http://www.spinics.net/lists/netdev/msg153312.html
The above patch rewrote and moved the code responsible for cleaning
up discarded fragments but the new code does not catch every case
where this is necessary. This results in some discarded fragments
remaining in the queue, and triggering a 'bad seq' error on the
subsequent call to ppp_mp_reconstruct. Fragments are discarded
whenever other fragments of the same frame have been lost.
This can generate a lot of unwanted and misleading log messages.
This patch also adds additional detail to the debug logging to
make it clearer which fragments were lost and which other fragments
were discarded as a result of losses. (Run pppd with 'kdebug 1'
option to enable debug logging.)
Signed-off-by: Ben McKeegan <ben@netservers.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>