pte_present might return true on PAGE_TYPE_NONE, even if
the invalid bit is on. Modify the existing check of the
pgste functions to avoid crashes.
[ Martin Schwidefsky: added ptep_modify_prot_[start|commit] bits ]
Reported-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
CC: stable@vger.kernel.org
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
'buf[2]' is only 2 bytes long, and sprintf() appends a '\0' after the
string "?\n", so the original implementation overflows the buffer.
Use strncpy() and strnlen() instead of sprintf().
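A minimal user-space sketch of the idea, not the patched driver code.
The helper name show_unknown() and its signature are hypothetical; the
point is only that a 2-byte buffer has room for "?\n" but not for the
terminating NUL that sprintf() would add.

```c
#define _POSIX_C_SOURCE 200809L	/* for strnlen() */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: store "?\n" in a buffer of 'size' bytes and
 * return the number of bytes stored.
 *
 * sprintf(buf, "?\n") would write three bytes ('?', '\n', '\0').
 * strncpy() never writes more than 'size' bytes, and strnlen() never
 * reads past them, so a 2-byte buffer is not overrun.
 */
static size_t show_unknown(char *buf, size_t size)
{
	assert(buf != NULL);
	strncpy(buf, "?\n", size);
	return strnlen(buf, size);
}
```

Note that with size == 2 the result is deliberately not NUL-terminated,
so callers have to use the returned length.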
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
IPIs might be lost when a cpu gets brought offline:
When stop_machine executes its state machine there is a race window
for the state STOPMACHINE_DISABLE_IRQ where the to be brought offline
cpu might already have irqs disabled but a different cpu still may
have irqs enabled.
If the enabled cpu receives an interrupt and as a result sends an IPI
to the to be offlined cpu in its bottom half context, the IPI won't
be noticed before the cpu is offline.
In fact the race window is much larger since there is no guarantee
when an IPI will be received.
To fix this, check for enqueued but not yet received IPIs in the
cpu_disable() path and call the respective handlers before the cpu
is marked offline.
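The ordering can be sketched in user space as follows. This is an
illustration under stated assumptions, not the s390 code: the struct,
the IPI types and the function names are all hypothetical, and a real
implementation would fetch the pending bits atomically.

```c
#include <stdbool.h>

/* Hypothetical IPI types; the real set is architecture specific. */
enum { IPI_CALL_FUNC, IPI_RESCHEDULE, NR_IPIS };

struct cpu_state {
	unsigned long pending;		/* one bit per enqueued IPI type */
	unsigned int handled[NR_IPIS];	/* how often each handler ran */
	bool online;
};

/* Run the handler for every IPI that was enqueued but not received. */
static void handle_pending_ipis(struct cpu_state *cpu)
{
	for (int bit = 0; bit < NR_IPIS; bit++) {
		if (cpu->pending & (1UL << bit)) {
			cpu->pending &= ~(1UL << bit);
			cpu->handled[bit]++;
		}
	}
}

/* Drain pending IPIs first, only then mark the cpu offline. */
static void sketch_cpu_disable(struct cpu_state *cpu)
{
	handle_pending_ipis(cpu);
	cpu->online = false;
}
```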
Reported-by: Juergen Doelle <juergen.doelle@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
On s390 the prefix page and absolute zero pages are not correctly
returned when reading /dev/mem. The reason is that the s390 asm/io.h
file includes the asm-generic/io.h file which then defines
xlate_dev_mem_ptr() and therefore overwrites the s390 specific
version that does the correct swap operation for prefix and absolute
zero pages. The problem is a regression that was introduced with git
commit cd248341 (s390/pci: base support).
To fix the problem add "#ifndef xlate_dev_mem_ptr" in asm-generic/io.h
and "#define xlate_dev_mem_ptr" in asm/io.h. This ensures that the
s390 version is used. For completeness also add the "#ifndef"
construct for xlate_dev_kmem_ptr().
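The override pattern can be sketched in a single file as below. Both
"headers" are stand-ins: the arch version here just flips one address
bit as a placeholder for the real prefix swap, and the generic fallback
is an identity mapping chosen for illustration.

```c
#include <stdint.h>

/* --- stand-in for the arch header (asm/io.h) --- */
static inline uintptr_t xlate_dev_mem_ptr(uintptr_t phys)
{
	/* Placeholder for the arch-specific address swap. */
	return phys ^ 0x1000;
}
/* Tell the generic header that an arch version already exists. */
#define xlate_dev_mem_ptr xlate_dev_mem_ptr

/* --- stand-in for the generic header (asm-generic/io.h) --- */
#ifndef xlate_dev_mem_ptr
#define xlate_dev_mem_ptr(p) (p)	/* generic fallback, skipped here */
#endif
```

Because the macro is defined before the generic part is seen, the
"#ifndef" block is skipped and callers get the arch function.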
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
In dma_free_coherent call debug_dma_free_coherent before deallocating
the memory to avoid a possible use after free.
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pull s390 update from Martin Schwidefsky:
"An additional sysfs attribute for channel paths and a couple of bug
fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/pgtable: fix ipte notify bit
s390/xpram: mark xpram as non-rotational
s390/smp: fix cpu re-scan vs. cpu state
s390/cio: add channel ID sysfs attribute
s390/ftrace: fix mcount adjustment
s390: fix gmap_ipte_notifier vs. software dirty pages
s390: disable pfmf for clear page instruction
s390/disassembler: prevent endless loop in print_fn_code()
s390: remove non existent reference to GENERIC_KERNEL_THREAD
Before f7b861b7a6 ("arm: Use generic idle loop") ARM would kill the
CPU within the rcu idle section. Now that the rcu_idle_enter()/exit()
pair has been pushed lower down in the idle loop this is no longer
true, so using RCU_NONIDLE here is no longer necessary. It is also
harmful, because RCU is not actually idle at this point.
Cc: Russell King <linux@arm.linux.org.uk>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In commit 78d77df715 ("x86-64, init: Do not set NX bits on non-NX
capable hardware") we added the early_pmd_flags that gets the NX bit set
when a CPU supports NX. However, the new variable was marked __initdata,
because the main _use_ of this is in an __init routine.
However, the bit setting happens from secondary_startup_64(), which is
called not only at bootup, but on every secondary CPU start, including
when resuming from STR and at CPU hotplug time. So the value cannot be
__initdata.
Reported-bisected-and-tested-by: Michal Hocko <mhocko@suse.cz>
Cc: stable@vger.kernel.org # v3.9
Acked-by: Peter Anvin <hpa@linux.intel.com>
Cc: Fernando Luis Vázquez Cao <fernando@oss.ntt.co.jp>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull devm usage cleanup from Wolfram Sang:
"Lately, I have been experimenting how to improve the devm interface to
make writing device drivers easier and less error prone while also
getting rid of its subtle issues. I think it has more potential but
still needs work and definitely consistency, especially in its usage.
The first thing I came up with is a low hanging fruit regarding
devm_ioremap_resource(). This function already checks if the passed
resource is valid and gives an error message if not. So, we can
remove similar checks from the drivers and get rid of a bit of code
and a number of inconsistent error strings.
This series only removes the unneeded check iff devm_ioremap_resource
follows platform_get_resource directly. The previous version tried to
shuffle code if needed, too, which led to an embarrassing bug. It
turned out to me that shuffling code for all cases found will make the
automated script too complex, so I am unsure if an automated cleanup
is the proper tool for this case. Removing the easy stuff seems
worthwhile to me, though.
Despite various architectures and platform dependencies, I managed to
compile test 45 out of 57 modified files locally using heuristics and
defconfigs."
Pulled because: 296 deletions, 0 additions.
* 'devm_no_resource_check' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (33 commits)
sound/soc/kirkwood: don't check resource with devm_ioremap_resource
sound/soc/fsl: don't check resource with devm_ioremap_resource
arch/mips/lantiq/xway: don't check resource with devm_ioremap_resource
arch/arm/plat-samsung: don't check resource with devm_ioremap_resource
arch/arm/mach-tegra: don't check resource with devm_ioremap_resource
drivers/watchdog: don't check resource with devm_ioremap_resource
drivers/w1/masters: don't check resource with devm_ioremap_resource
drivers/video/omap2/dss: don't check resource with devm_ioremap_resource
drivers/video/omap2: don't check resource with devm_ioremap_resource
drivers/usb/phy: don't check resource with devm_ioremap_resource
drivers/usb/host: don't check resource with devm_ioremap_resource
drivers/usb/gadget: don't check resource with devm_ioremap_resource
drivers/usb/chipidea: don't check resource with devm_ioremap_resource
drivers/thermal: don't check resource with devm_ioremap_resource
drivers/staging/nvec: don't check resource with devm_ioremap_resource
drivers/staging/dwc2: don't check resource with devm_ioremap_resource
drivers/spi: don't check resource with devm_ioremap_resource
drivers/rtc: don't check resource with devm_ioremap_resource
drivers/pwm: don't check resource with devm_ioremap_resource
drivers/pinctrl: don't check resource with devm_ioremap_resource
...
Pull MIPS fixes from Ralf Baechle:
"Patching up across the field. The reversion of the two ASID patches
is particularly important as it was breaking many platforms."
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
MIPS: ralink: use the dwc2 driver for the rt305x USB controller
MIPS: Extract schedule_mfi info from __schedule
MIPS: Fix sibling call handling in get_frame_info
MIPS: MSP71xx: remove inline marking of EXPORT_SYMBOL functions
MIPS: Make virt_to_phys() work for all unmapped addresses.
MIPS: Fix build error for crash_dump.c in 3.10-rc1
MIPS: Xway: Fix clk leak
Revert "MIPS: Allow ASID size to be determined at boot time."
Revert "MIPS: microMIPS: Support dynamic ASID sizing."
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64
Pull arm64 fixes from Catalin Marinas:
"Fixes for duplicate definition of early_console, kernel/time/Kconfig
include, __flush_dcache_all() set/way computing, debug (locking, bit
testing). The of_platform_populate() was moved to an arch_init_call()
to allow subsys_init_call() drivers to probe the DT."
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64:
arm64: debug: fix mdscr.ss check when enabling debug exceptions
arm64: Do not source kernel/time/Kconfig explicitly
arm64: mm: Fix operands of clz in __flush_dcache_all
arm64: Invoke the of_platform_populate() at arch_initcall() level
arm64: debug: clear mdscr_el1 instead of taking the OS lock
arm64: Fix duplicate definition of early_console
devm_ioremap_resource does sanity checks on the given resource. No need to
duplicate this in the driver.
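A user-space sketch of the cleanup, assuming mock types only. Nothing
here is the kernel API: mock_ioremap_resource() and mock_probe() are
hypothetical stand-ins, and the mock reports failure through an int
instead of an error pointer to stay self-contained.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* User-space stand-in, NOT the kernel's struct resource. */
struct resource {
	unsigned long start, end;
};

/* Mock of a mapping helper that validates its argument itself. */
static void *mock_ioremap_resource(const struct resource *res, int *err)
{
	static char fake_mmio[16];

	if (!res || res->end < res->start) {
		fprintf(stderr, "invalid resource\n");
		*err = -EINVAL;
		return NULL;
	}
	*err = 0;
	return fake_mmio;
}

static int mock_probe(const struct resource *res)
{
	int err;
	void *base;

	/*
	 * No separate "if (!res) return ...;" check and no driver
	 * specific error string: the helper rejects a missing or
	 * invalid resource and prints the message itself.
	 */
	base = mock_ioremap_resource(res, &err);
	if (!base)
		return err;
	return 0;
}
```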
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Acked-by: John Crispin <blogic@openwrt.org>
devm_ioremap_resource does sanity checks on the given resource. No need to
duplicate this in the driver.
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
devm_ioremap_resource does sanity checks on the given resource. No need to
duplicate this in the driver.
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Acked-by: Stephen Warren <swarren@nvidia.com>
This sets up the devicetree file for the rt3050 chip series and rt3052
eval board to use the right compatible string for the dwc2 driver.
Acked-by: John Crispin <blogic@openwrt.org>
Cc: blogic@openwrt.org
Cc: linux-mips@linux-mips.org
Cc: Matthijs Kooijman <matthijs@stdin.nl>
Patchwork: https://patchwork.linux-mips.org/patch/5226/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
schedule_mfi is supposed to be extracted from schedule(), and
is used in thread_saved_pc and get_wchan.
But, after optimization, schedule() is reduced to a sibling
call to __schedule(), and no real frame info can be extracted.
One solution is to compile schedule() with -fno-omit-frame-pointer
and -fno-optimize-sibling-calls, but that will incur performance
degradation.
Another solution is to extract info from the real scheduler,
__schedule, and this is the approach adopted here.
This patch reads the __schedule address either by following the 'j'
call in schedule if KALLSYMS is disabled, or by using
kallsyms_lookup_name to look up __schedule if KALLSYMS is
available, and then extracts schedule_mfi from the __schedule frame info.
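Following a 'j' instruction to its target can be sketched as below.
The helper name mips_j_target() is hypothetical; the decoding follows
the classic MIPS J-type format (6-bit opcode, 26-bit word index, upper
four bits taken from the address of the delay slot).

```c
#include <stdint.h>

#define MIPS_OP_J	0x02u	/* major opcode of 'j' */

/*
 * If 'insn' (located at address 'pc') is a 'j' instruction, store its
 * target in *target and return 1. Otherwise return 0.
 */
static int mips_j_target(uint32_t pc, uint32_t insn, uint32_t *target)
{
	if ((insn >> 26) != MIPS_OP_J)
		return 0;

	/* Upper 4 bits come from the delay slot address (pc + 4). */
	*target = ((pc + 4) & 0xf0000000u) | ((insn & 0x03ffffffu) << 2);
	return 1;
}
```

For a word 0x080726f0 located at 0x801ca118 this yields 0x801c9bc0.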
This patch also fixes the "Can't analyze schedule() prologue"
warning at boot time.
Signed-off-by: Tony Wu <tung7970@gmail.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/5237/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Given a function, get_frame_info() analyzes its instructions
to figure out frame size and return address. get_frame_info()
works as follows:
1. analyze up to 128 instructions if the function size is unknown
2. search for 'addiu/daddiu sp,sp,-immed' for frame size
3. search for 'sw ra,offset(sp)' for return address
4. end search when it sees jr/jal/jalr
This leads to an issue when the given function is a sibling
call, as in the following example.
801ca110 <schedule>:
801ca110: 8f820000 lw v0,0(gp)
801ca114: 8c420000 lw v0,0(v0)
801ca118: 080726f0 j 801c9bc0 <__schedule>
801ca11c: 00000000 nop
801ca120 <io_schedule>:
801ca120: 27bdffe8 addiu sp,sp,-24
801ca124: 3c028022 lui v0,0x8022
801ca128: afbf0014 sw ra,20(sp)
In this case, get_frame_info() cannot properly detect schedule's
frame info, and eventually returns io_schedule's instead.
This patch adds 'j' to the end search condition to work around
sibling call cases.
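The effect of the extra end condition can be sketched in user space.
This is not the kernel's get_frame_info(): struct frame_info,
scan_frame() and the stop_on_j switch are hypothetical, and only the
32-bit 'addiu sp,sp,-immed' and 'sw ra,offset(sp)' encodings are
recognized.

```c
#include <stddef.h>
#include <stdint.h>

struct frame_info {
	int frame_size;	/* bytes, 0 if no stack adjustment was seen */
	int ra_offset;	/* byte offset of saved ra, -1 if not seen */
};

/* jr/jalr/jal always end the search; 'j' only if stop_on_j is set. */
static int ends_search(uint32_t insn, int stop_on_j)
{
	uint32_t op = insn >> 26, funct = insn & 0x3f;

	if (op == 0 && (funct == 0x08 || funct == 0x09))
		return 1;			/* jr, jalr */
	if (op == 0x03)
		return 1;			/* jal */
	return stop_on_j && op == 0x02;		/* j */
}

/* Scan up to n instruction words; return 1 if a frame size was found. */
static int scan_frame(const uint32_t *code, size_t n, int stop_on_j,
		      struct frame_info *fi)
{
	fi->frame_size = 0;
	fi->ra_offset = -1;

	for (size_t i = 0; i < n; i++) {
		uint32_t insn = code[i];
		int imm = (int16_t)(insn & 0xffff);

		if (ends_search(insn, stop_on_j))
			break;
		if ((insn >> 16) == 0x27bd && imm < 0 && !fi->frame_size)
			fi->frame_size = -imm;	/* addiu sp,sp,-immed */
		else if ((insn >> 16) == 0xafbf)
			fi->ra_offset = imm;	/* sw ra,offset(sp) */
	}
	return fi->frame_size != 0;
}
```

Fed the seven words from the listing above, the scan without the 'j'
condition runs on into io_schedule and reports its 24-byte frame; with
the condition it stops at the 'j' and reports no frame for schedule.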
Signed-off-by: Tony Wu <tung7970@gmail.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/5236/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
As reported:
This problem was discovered when doing BGP traffic with the TCP MD5 option
activated, where the following call chain caused a crash:
* tcp_v4_rcv
* tcp_v4_timewait_ack
* tcp_v4_send_ack -> follow stack variable rep.th
* tcp_v4_md5_hash_hdr
* tcp_md5_hash_header
* sg_init_one
* sg_set_buf
* virt_to_page
I noticed that tcp_v4_send_reset uses a similar stack variable and
also calls tcp_v4_md5_hash_hdr, so it has the same problem.
The networking core can indirectly call virt_to_phys() on stack
addresses. If this is done from PID 0, the stack will usually be in
CKSEG0, so virt_to_phys() needs to work there as well.
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: Jiang Liu <liuj97@gmail.com>
Cc: eunb.song@samsung.com
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/5220/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This patch fixes a build error in crash_dump.c. The build log is as follows:
arch/mips/kernel/crash_dump.c: In function 'kdump_buf_page_init':
arch/mips/kernel/crash_dump.c:67: error: implicit declaration of function 'kmalloc'
arch/mips/kernel/crash_dump.c:67: error: assignment makes pointer from integer without a cast
Signed-off-by: EunBong Song <eunb.song@samsung.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/5238/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
When we take an exception at EL1, we only want to enable debug
exceptions if we're not currently stepping, otherwise we can easily get
stuck in a loop stepping into interrupt handlers.
Unfortunately, the current code tests the wrong bit in the mdscr, so fix
that.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The cpu-info array starts with a list of cpus in configured state,
followed by the cpus in standby state. The comparison to decide which
state a cpu has is incorrect, which causes configured cpus to appear as
standby cpus. The correct comparison is the index of the new cpu in
the cpu-info array vs. the number of configured cpus.
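The comparison can be sketched as follows. The enum and function names
are hypothetical and only illustrate the rule stated above: entries
below the configured count are configured, the rest are standby.

```c
/* Hypothetical state names for this sketch. */
enum cpu_state { CPU_STATE_CONFIGURED, CPU_STATE_STANDBY };

/*
 * 'index' is the position of the cpu in the cpu-info array and
 * 'configured' is the number of configured cpus at its start.
 */
static enum cpu_state state_for_index(unsigned int index,
				      unsigned int configured)
{
	return index < configured ? CPU_STATE_CONFIGURED
				  : CPU_STATE_STANDBY;
}
```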
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
This reverts commit d532f3d267.
The original commit has several problems:
1) Doesn't work with 64-bit kernels.
2) Calls TLBMISS_HANDLER_SETUP() before the code is generated.
3) Calls TLBMISS_HANDLER_SETUP() twice in per_cpu_trap_init() when
only one call is needed.
[ralf@linux-mips.org: Also revert the bits of the ASID patch which were
hidden in the KVM merge.]
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Cc: "Steven J. Hill" <Steven.Hill@imgtec.com>
Cc: David Daney <david.daney@cavium.com>
Patchwork: https://patchwork.linux-mips.org/patch/5242/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Pull x86 fixes from Thomas Gleixner:
- Fix for a CPU hot-add deadlock in microcode update code
- Fix for idle consolidation fallout
- Documentation update for initial kernel direct mapping
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Add missing comments for initial kernel direct mapping
x86/microcode: Add local mutex to fix physical CPU hot-add deadlock
x86: Fix idle consolidation fallout
Pull timer fixes from Thomas Gleixner:
- Cure for not using zalloc in the first place, which leads to random
crashes with CPUMASK_OFF_STACK.
- Revert a user space visible change which broke udev
- Add a missing cpu_online early return introduced by the new full
dyntick conversions
- Plug a long standing race in the timer wheel cpu hotplug code.
Sigh...
- Cleanup NOHZ per cpu data on cpu down to prevent stale data on cpu
up.
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
time: Revert ALWAYS_USE_PERSISTENT_CLOCK compile time optimizations
timer: Don't reinitialize the cpu base lock during CPU_UP_PREPARE
tick: Don't invoke tick_nohz_stop_sched_tick() if the cpu is offline
tick: Cleanup NOHZ per cpu data on cpu down
tick: Use zalloc_cpumask_var for allocating offstack cpumasks
Pull core fixes from Thomas Gleixner:
- Two fixlets for the fallout of the generic idle task conversion
- Documentation update
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rcu/idle: Wrap cpu-idle poll mode within rcu_idle_enter/exit
idle: Fix hlt/nohlt command-line handling in new generic idle
kthread: Document ways of reducing OS jitter due to per-CPU kthreads
Pull ARM fixes from Russell King:
"A small number of fixes for stuff from the last merge window, and in
one case (IRQ time accounting) the previous merge window."
* 'fixes' of git://git.linaro.org/people/rmk/linux-arm:
ARM: 7720/1: ARM v6/v7 cmpxchg64 shouldn't clear upper 32 bits of the old/new value
ARM: 7715/1: MCPM: adapt to GIC changes after upstream merge
ARM: 7714/1: mmc: mmci: Ensure return value of regulator_enable() is checked
ARM: 7712/1: Remove trailing whitespace in arch/arm/Makefile
ARM: 7711/1: dove: fix Dove cpu type from V7 to PJ4
ARM: finally enable IRQ time accounting config
Tony Jones reported that the ftrace self tests on s390 do not work:
<6>Testing dynamic ftrace ops #1: (0 0 0 0 0) FAILED!
<6>Testing tracer irqsoff:
<3>failed to start irqsoff tracer
<4>.. no entries found ..FAILED!
<6>Testing tracer wakeup:
<3>failed to start wakeup tracer
<4>.. no entries found ..FAILED!
<6>Testing tracer function_graph:
<4>Failed to init function_graph tracer, init returned -19
<4>FAILED!
This happens because we forgot to adjust the instruction pointer that gets
passed to the ftrace trace function by MCOUNT_INSN_SIZE.
In addition change MCOUNT_INSN_SIZE to the correct value on 31 bit.
It only worked so far because the to be patched instruction was identical.
Reported-by: Tony Jones <tonyj@suse.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Under heavy paging load some guest cpus started to loop in gmap_ipte_notify.
This was visible as stalled cpus inside the guest. The gmap_ipte_notifier
tries to map a user page and then makes sure that the pte is valid and
writable. It turns out that with the software change bit tracking the pte
can become read-only (and only software writable) if the page is clean.
Since we loop in this code, the page would stay clean and, therefore,
never be writable again.
Let us just use fixup_user_fault, that guarantees to call handle_mm_fault.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Kay Sievers noted that the ALWAYS_USE_PERSISTENT_CLOCK config,
which enables some minor compile time optimization to avoid
unnecessary code, mostly in the suspend/resume path, could cause
problems for userland.
In particular, the dependency for RTC_HCTOSYS on
!ALWAYS_USE_PERSISTENT_CLOCK, which avoids setting the time
twice and simplifies suspend/resume, has the side effect
of causing the /sys/class/rtc/rtcN/hctosys flag to always be
zero, and this flag is commonly used by udev to setup the
/dev/rtc symlink to /dev/rtcN, which can cause pain for
older applications.
While the udev rules could use some work to be less fragile,
breaking userland should strongly be avoided. Additionally
the compile time optimizations are fairly minor, and the code
being optimized is likely to be reworked in the future, so
let's revert this change.
Reported-by: Kay Sievers <kay@vrfy.org>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Cc: stable <stable@vger.kernel.org> #3.9
Cc: Feng Tang <feng.tang@intel.com>
Cc: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>
Link: http://lkml.kernel.org/r/1366828376-18124-1-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
As per commit 764e0da1 (timers: Fixup the Kconfig consolidation
fallout), init/Kconfig already includes kernel/time/Kconfig, so no need
to do it explicitly for arm64.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
The format of the lower 32-bits of the 64-bit operand to 'dc cisw' is
unchanged from ARMv7 architecture and the upper bits are RES0. This
implies that the 'way' field of the operand of 'dc cisw' occupies the
bit-positions [31 .. (32-A)]. Due to the use of 64-bit extended operands
to 'clz', the existing implementation of __flush_dcache_all is incorrectly
placing the 'way' field in the bit-positions [63 .. (64-A)].
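The difference between the two shifts can be sketched in user space
with the GCC/Clang count-leading-zeros builtins. The helper names are
hypothetical, 'max_way' is the highest way number (associativity minus
one), and the cache level bits of the operand are left out.

```c
#include <stdint.h>

/* Shift that places the way number in bits [31 .. (32-A)]. */
static unsigned int way_shift32(uint32_t max_way)
{
	return max_way ? (unsigned int)__builtin_clz(max_way) : 0;
}

/* The same computation on a 64-bit operand: bits [63 .. (64-A)]. */
static unsigned int way_shift64(uint64_t max_way)
{
	return max_way ? (unsigned int)__builtin_clzll(max_way) : 0;
}

/* Combine way and set numbers into the lower 32 bits of the operand. */
static uint64_t cisw_operand(uint32_t way, uint32_t max_way,
			     uint32_t set, unsigned int line_shift)
{
	return ((uint64_t)way << way_shift32(max_way)) |
	       ((uint64_t)set << line_shift);
}
```

For a 4-way cache (max_way = 3) the 32-bit count gives a shift of 30,
while the 64-bit count gives 62 and would move the way field into the
upper half of the register.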
Signed-off-by: Sukanto Ghosh <sghosh@apm.com>
Tested-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: stable@vger.kernel.org
The of_platform_populate() is currently invoked at device_initcall()
level. There are however drivers that use platform_driver_probe()
directly and they need the devices to be populated. This patch moves the
of_platform_populate() call to arch_initcall() level.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Benoit Lecardonnel <Benoit.Lecardonnel@synopsys.com>
Tested-by: Benoit Lecardonnel <Benoit.Lecardonnel@synopsys.com>
Pull powerpc fixes from Benjamin Herrenschmidt:
"This is mostly bug fixes (some of them regressions, some of them I
deemed worth merging now) along with some patches from Li Zhong
hooking up the new context tracking stuff (for the new full NO_HZ)"
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (25 commits)
powerpc: Set show_unhandled_signals to 1 by default
powerpc/perf: Fix setting of "to" addresses for BHRB
powerpc/pmu: Fix order of interpreting BHRB target entries
powerpc/perf: Move BHRB code into CONFIG_PPC64 region
powerpc: select HAVE_CONTEXT_TRACKING for pSeries
powerpc: Use the new schedule_user API on userspace preemption
powerpc: Exit user context on notify resume
powerpc: Exception hooks for context tracking subsystem
powerpc: Syscall hooks for context tracking subsystem
powerpc/booke64: Fix kernel hangs at kernel_dbg_exc
powerpc: Fix irq_set_affinity() return values
powerpc: Provide __bswapdi2
powerpc/powernv: Fix starting of secondary CPUs on OPALv2 and v3
powerpc/powernv: Detect OPAL v3 API version
powerpc: Fix MAX_STACK_TRACE_ENTRIES too low warning again
powerpc: Make CONFIG_RTAS_PROC depend on CONFIG_PROC_FS
powerpc: Bring all threads online prior to migration/hibernation
powerpc/rtas_flash: Fix validate_flash buffer overflow issue
powerpc/kexec: Fix kexec when using VMX optimised memcpy
powerpc: Fix build errors STRICT_MM_TYPECHECKS
...
Currently we only set the "to" address in the branch stack when the CPU
explicitly gives us a value. Unfortunately it only does this for XL form
branches (eg blr, bctr, bctar) and not I and B form branches (eg b, bc).
Fortunately if we read the instruction from memory we can extract the offset of
a branch and calculate the target address.
This adds a function power_pmu_bhrb_to() to calculate the target/to address of
the corresponding I and B form branches. It handles branches in both user and
kernel spaces. It also plumbs this into the perf BHRB reading code.
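The target calculation can be sketched in user space as below. This is
not power_pmu_bhrb_to() itself: rel_branch_target() is a hypothetical
helper that only decodes the I-form (opcode 18) and B-form (opcode 16)
displacement fields and honours the AA (absolute address) bit. Reading
the instruction from user or kernel memory is left out.

```c
#include <stdint.h>

/*
 * If 'insn' (located at 'addr') is an I-form or B-form branch, store
 * its target in *target and return 1. Return 0 for anything else,
 * e.g. XL-form branches such as blr.
 */
static int rel_branch_target(uint32_t insn, uint64_t addr,
			     uint64_t *target)
{
	uint32_t opcode = insn >> 26;
	int64_t disp;

	if (opcode == 18) {		/* I-form: b, ba, bl, bla */
		disp = insn & 0x03fffffc;
		if (disp & 0x02000000)	/* sign-extend 26 bits */
			disp -= 0x04000000;
	} else if (opcode == 16) {	/* B-form: bc and friends */
		disp = insn & 0xfffc;
		if (disp & 0x8000)	/* sign-extend 16 bits */
			disp -= 0x10000;
	} else {
		return 0;
	}

	/* AA set: displacement is the address; else it is relative. */
	*target = (insn & 0x2) ? (uint64_t)disp : addr + (uint64_t)disp;
	return 1;
}
```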
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The current Branch History Rolling Buffer (BHRB) code misinterprets the order
of entries in the hardware buffer. It assumes that a branch target address
will be read _after_ its corresponding branch. In reality the branch target
comes before (lower mfbhrb entry) its corresponding branch.
This is a rewrite of the code to take this into account.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The new Branch History Rolling buffer (BHRB) code is only useful on 64bit
processors, so move it into the #ifdef CONFIG_PPC64 region.
This avoids code bloat on 32bit systems.
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Start context tracking support from pSeries.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch corresponds to
[PATCH] x86: Use the new schedule_user API on userspace preemption
commit 0430499ce9
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch allows RCU usage in do_notify_resume, e.g. signal handling.
It corresponds to
[PATCH] x86: Exit RCU extended QS on notify resume
commit edf55fda35
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
These are the exception hooks for the context tracking subsystem, covering
data access, program check, single step, instruction breakpoint, machine check,
alignment, fp unavailable, altivec assist and unknown exception, whose handlers
might use RCU.
This patch corresponds to
[PATCH] x86: Exception hooks for userspace RCU extended QS
commit 6ba3c97a38
However, since the exception handling moved to generic code, along with some
changes in the following two commits:
56dd9470d7
context_tracking: Move exception handling to generic code
6c1e0256fa
context_tracking: Restore correct previous context state on exception exit
the exception hooks can use the generic code above instead of a
redundant arch implementation.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
These are the syscall slow path hooks for the context tracking subsystem,
corresponding to
[PATCH] x86: Syscall hooks for userspace RCU extended QS
commit bf5a3c13b9
TIF_MEMDIE is moved to the second 16 bits (with value 17), as there seems to
be no asm code using it. TIF_NOHZ is added to _TIF_SYSCALL_T_OR_A, so it is
better for it to be in the same 16 bits as the others in the group, so that
andi. with this group works in the asm code.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
MSR_DE is not cleared on entry to the kernel, and we don't clear it
explicitly outside of debug code. If we have MSR_DE set in
prime_debug_regs(), and the new thread has events enabled in DBCR0
(e.g. ICMP is set in thread->dbsr0, even though it was cleared in the
real DBCR0 when the thread got scheduled out), we'll end up taking a
debug exception in the kernel when DBCR0 is loaded. DSRR0 will not
point to an exception vector, and the kernel ends up hanging at
kernel_dbg_exc. Fix this by always clearing MSR_DE when we load new
debug state.
Another observed source of kernel_dbg_exc hangs is with the branch
taken event. If this event is active, but we take a non-debug trap
(e.g. a TLB miss or an asynchronous interrupt) before the next branch,
we end up taking a branch-taken debug exception on the initial branch
instruction of the exception vector, but because the debug exception is
DBSR_BT rather than DBSR_IC we branch to kernel_dbg_exc before even
checking the DSRR0 address. Fix this by checking for DBSR_BT as well
as DBSR_IC, which is what 32-bit does and what the comments suggest was
intended in the 64-bit code as well.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Some versions of GCC apparently expect this to be provided by libgcc.
Updates from Mikey to fix 32 bit version and adding "r" to registers.
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
The current code fails to handle kexec on OPALv2. This fixes it
and adds code to improve the situation on OPALv3 where we can
query the CPU status from the firmware and decide what to do
based on that.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>