Use standard __init_begin and __init_end instead.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
The nohz full code needs irq work to trigger its own interrupt so that
the subsystem can work even when the tick is stopped.
Lets introduce arch_irq_work_has_interrupt() that archs can override to
tell about their support for this ability.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x). This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.
Other use cases are for storing and retrieving data from the current
processors percpu area. __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.
__get_cpu_var() is defined as :
#define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
__get_cpu_var() always only does an address determination. However, store
and retrieve operations could use a segment prefix (or global register on
other platforms) to avoid the address calculation.
this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.
This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset. Thereby address calculations are avoided and less registers
are used when code is generated.
At the end of the patch set all uses of __get_cpu_var have been removed so
the macro is removed too.
The patch set includes passes over all arches as well. Once these operations
are used throughout then specialized macros can be defined in non -x86
arches as well in order to optimize per cpu access by f.e. using a global
register that may be set to the per cpu base.
Transformations done to __get_cpu_var()
1. Determine the address of the percpu instance of the current processor.
DEFINE_PER_CPU(int, y);
int *x = &__get_cpu_var(y);
Converts to
int *x = this_cpu_ptr(&y);
2. Same as #1 but this time an array structure is involved.
DEFINE_PER_CPU(int, y[20]);
int *x = __get_cpu_var(y);
Converts to
int *x = this_cpu_ptr(y);
3. Retrieve the content of the current processors instance of a per cpu
variable.
DEFINE_PER_CPU(int, y);
int x = __get_cpu_var(y)
Converts to
int x = __this_cpu_read(y);
4. Retrieve the content of a percpu struct
DEFINE_PER_CPU(struct mystruct, y);
struct mystruct x = __get_cpu_var(y);
Converts to
memcpy(&x, this_cpu_ptr(&y), sizeof(x));
5. Assignment to a per cpu variable
DEFINE_PER_CPU(int, y)
__get_cpu_var(y) = x;
Converts to
__this_cpu_write(y, x);
6. Increment/Decrement etc of a per cpu variable
DEFINE_PER_CPU(int, y);
__get_cpu_var(y)++
Converts to
__this_cpu_inc(y)
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Pull arch signal handling cleanup from Richard Weinberger:
"This patch series moves all remaining archs to the get_signal(),
signal_setup_done() and sigsp() functions.
Currently these archs use open coded variants of the said functions.
Further, unused parameters get removed from get_signal_to_deliver(),
tracehook_signal_handler() and signal_delivered().
At the end of the day we save around 500 lines of code."
* 'signal-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/misc: (43 commits)
powerpc: Use sigsp()
openrisc: Use sigsp()
mn10300: Use sigsp()
mips: Use sigsp()
microblaze: Use sigsp()
metag: Use sigsp()
m68k: Use sigsp()
m32r: Use sigsp()
hexagon: Use sigsp()
frv: Use sigsp()
cris: Use sigsp()
c6x: Use sigsp()
blackfin: Use sigsp()
avr32: Use sigsp()
arm64: Use sigsp()
arc: Use sigsp()
sas_ss_flags: Remove nested ternary if
Rip out get_signal_to_deliver()
Clean up signal_delivered()
tracehook_signal_handler: Remove sig, info, ka and regs
...
The core mm code will provide a default gate area based on
FIXADDR_USER_START and FIXADDR_USER_END if
!defined(__HAVE_ARCH_GATE_AREA) && defined(AT_SYSINFO_EHDR).
This default is only useful for ia64. arm64, ppc, s390, sh, tile, 64-bit
UML, and x86_32 have their own code just to disable it. arm, 32-bit UML,
and x86_64 have gate areas, but they have their own implementations.
This gets rid of the default and moves the code into ia64.
This should save some code on architectures without a gate area: it's now
possible to inline the gate_area functions in the default case.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Acked-by: Nathan Lynch <nathan_lynch@mentor.com>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [in principle]
Acked-by: Richard Weinberger <richard@nod.at> [for um]
Acked-by: Will Deacon <will.deacon@arm.com> [for arm64]
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Nathan Lynch <Nathan_Lynch@mentor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use the more generic functions get_signal() signal_setup_done()
for signal delivery.
Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
The arch_mutex_cpu_relax() function, introduced by 34b133f, is
hacky and ugly. It was added a few years ago to address the fact
that common cpu_relax() calls include yielding on s390, and thus
impact the optimistic spinning functionality of mutexes. Nowadays
we use this function well beyond mutexes: rwsem, qrwlock, mcs and
lockref. Since the macro that defines the call is in the mutex header,
any users must include mutex.h and the naming is misleading as well.
This patch (i) renames the call to cpu_relax_lowlatency ("relax, but
only if you can do it with very low latency") and (ii) defines it in
each arch's asm/processor.h local header, just like for regular cpu_relax
functions. On all archs, except s390, cpu_relax_lowlatency is simply cpu_relax,
and thus we can take it out of mutex.h. While this can seem redundant,
I believe it is a good choice as it allows us to move out arch specific
logic from generic locking primitives and enables future(?) archs to
transparently define it, similarly to System Z.
Signed-off-by: Davidlohr Bueso <davidlohr@hp.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Bharat Bhushan <r65777@freescale.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Howells <dhowells@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Deepthi Dharwar <deepthi@linux.vnet.ibm.com>
Cc: Dominik Dingel <dingel@linux.vnet.ibm.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Cc: Haavard Skinnemoen <hskinnemoen@gmail.com>
Cc: Hans-Christian Egtvedt <egtvedt@samfundet.no>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Jesper Nilsson <jesper.nilsson@axis.com>
Cc: Joe Perches <joe@perches.com>
Cc: Jonas Bonn <jonas@southpole.se>
Cc: Joseph Myers <joseph@codesourcery.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Mikael Starvik <starvik@axis.com>
Cc: Nicolas Pitre <nico@linaro.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Qais Yousef <qais.yousef@imgtec.com>
Cc: Qiaowei Ren <qiaowei.ren@intel.com>
Cc: Rafael Wysocki <rafael.j.wysocki@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Steven Miao <realmz6@gmail.com>
Cc: Steven Rostedt <srostedt@redhat.com>
Cc: Stratos Karafotis <stratosk@semaphore.gr>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vasily Kulikov <segoon@openwall.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Vineet Gupta <Vineet.Gupta1@synopsys.com>
Cc: Waiman Long <Waiman.Long@hp.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Wolfram Sang <wsa@the-dreams.de>
Cc: adi-buildroot-devel@lists.sourceforge.net
Cc: linux390@de.ibm.com
Cc: linux-alpha@vger.kernel.org
Cc: linux-am33-list@redhat.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-c6x-dev@linux-c6x.org
Cc: linux-cris-kernel@axis.com
Cc: linux-hexagon@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: linux@lists.openrisc.net
Cc: linux-m32r-ja@ml.linux-m32r.org
Cc: linux-m32r@ml.linux-m32r.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-metag@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: linux-xtensa@linux-xtensa.org
Cc: sparclinux@vger.kernel.org
Link: http://lkml.kernel.org/r/1404079773.2619.4.camel@buesod1.americas.hpqcorp.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull arch/tile changes from Chris Metcalf:
"These mostly just address smaller issues reported to me"
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
arch: tile: kernel: unaligned.c: Cleaning up uninitialized variables
drivers/tty/hvc/hvc_tile.c: use PTR_ERR_OR_ZERO
replace strict_strto* call with kstrto*
tile: Update comments for generic idle conversion
tile: cleanup the comment in init_pgprot
tile: use BOOTMEM_DEFAULT instead of magic number 0 for reserve_bootmem flags
Pull core irq updates from Thomas Gleixner:
"The irq department delivers:
- Another tree wide update to get rid of the horrible create_irq
interface along with its even more horrible variants. That also
gets rid of the last leftovers of the initial sparse irq hackery.
arch/driver specific changes have been either acked or ignored.
- A fix for the spurious interrupt detection logic with threaded
interrupts.
- A new ARM SoC interrupt controller
- The usual pile of fixes and improvements all over the place"
* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
Documentation: brcmstb-l2: Add Broadcom STB Level-2 interrupt controller binding
irqchip: brcmstb-l2: Add Broadcom Set Top Box Level-2 interrupt controller
genirq: Improve documentation to match current implementation
ARM: iop13xx: fix msi support with sparse IRQ
genirq: Provide !SMP stub for irq_set_affinity_notifier()
irqchip: armada-370-xp: Move the devicetree binding documentation
irqchip: gic: Use mask field in GICC_IAR
genirq: Remove dynamic_irq mess
ia64: Use irq_init_desc
genirq: Replace dynamic_irq_init/cleanup
genirq: Remove irq_reserve_irq[s]
genirq: Replace reserve_irqs in core code
s390: Avoid call to irq_reserve_irqs()
s390: Remove pointless arch_show_interrupts()
s390: pci: Check return value of alloc_irq_desc() proper
sh: intc: Remove pointless irq_reserve_irqs() invocation
x86, irq: Remove pointless irq_reserve_irqs() call
genirq: Make create/destroy_irq() ia64 private
tile: Use SPARSE_IRQ
tile: pci: Use irq_alloc/free_hwirq()
...
Pull scheduler updates from Ingo Molnar:
"The main scheduling related changes in this cycle were:
- various sched/numa updates, for better performance
- tree wide cleanup of open coded nice levels
- nohz fix related to rq->nr_running use
- cpuidle changes and continued consolidation to improve the
kernel/sched/idle.c high level idle scheduling logic. As part of
this effort I pulled cpuidle driver changes from Rafael as well.
- standardized idle polling amongst architectures
- continued work on preparing better power/energy aware scheduling
- sched/rt updates
- misc fixlets and cleanups"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (49 commits)
sched/numa: Decay ->wakee_flips instead of zeroing
sched/numa: Update migrate_improves/degrades_locality()
sched/numa: Allow task switch if load imbalance improves
sched/rt: Fix 'struct sched_dl_entity' and dl_task_time() comments, to match the current upstream code
sched: Consolidate open coded implementations of nice level frobbing into nice_to_rlimit() and rlimit_to_nice()
sched: Initialize rq->age_stamp on processor start
sched, nohz: Change rq->nr_running to always use wrappers
sched: Fix the rq->next_balance logic in rebalance_domains() and idle_balance()
sched: Use clamp() and clamp_val() to make sys_nice() more readable
sched: Do not zero sg->cpumask and sg->sgp->power in build_sched_groups()
sched/numa: Fix initialization of sched_domain_topology for NUMA
sched: Call select_idle_sibling() when not affine_sd
sched: Simplify return logic in sched_read_attr()
sched: Simplify return logic in sched_copy_attr()
sched: Fix exec_start/task_hot on migrated tasks
arm64: Remove TIF_POLLING_NRFLAG
metag: Remove TIF_POLLING_NRFLAG
sched/idle: Make cpuidle_idle_call() void
sched/idle: Reflow cpuidle_idle_call()
sched/idle: Delay clearing the polling bit
...
As of commit 0dc8153cfe ("tile: Use generic
idle loop"), this applies to arch_cpu_idle() instead of cpu_idle().
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Get rid of the private allocator and switch over to sparse IRQs.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Grant Likely <grant.likely@linaro.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Link: http://lkml.kernel.org/r/20140507154338.423715783@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We want to convert the drivers over to the new interface and finally
tile to sparse irqs. Implement irq_alloc/free_hwirq() for step by step
migration.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Grant Likely <grant.likely@linaro.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Link: http://lkml.kernel.org/r/20140507154336.947853241@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Standardize the idle polling indicator to TIF_POLLING_NRFLAG such that
both TIF_NEED_RESCHED and TIF_POLLING_NRFLAG are in the same word.
This will allow us, using fetch_or(), to both set NEED_RESCHED and
check for POLLING_NRFLAG in a single operation and avoid pointless
wakeups.
Changing from the non-atomic thread_info::status flags to the atomic
thread_info::flags shouldn't be a big issue since most polling state
changes were followed/preceded by a full memory barrier anyway.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/n/tip-35zzwlvwr7cp8xj196y10yyx@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We replace the old way to configure the scheduler topology with a new method
which enables a platform to declare additionnal level (if needed).
We still have a default topology table definition that can be used by platform
that don't want more level than the SMT, MC, CPU and NUMA ones. This table can
be overwritten by an arch which either wants to add new level where a load
balance make sense like BOOK or powergating level or wants to change the flags
configuration of some levels.
For each level, we need a function pointer that returns cpumask for each cpu,
a function pointer that returns the flags for the level and a name. Only flags
that describe topology, can be set by an architecture. The current topology
flags are:
SD_SHARE_CPUPOWER
SD_SHARE_PKG_RESOURCES
SD_NUMA
SD_ASYM_PACKING
Then, each level must be a subset on the next one. The build sequence of the
sched_domain will take care of removing useless levels like those with 1 CPU
and those with the same CPU span and no more relevant information for
load balancing than its children.
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hanjun Guo <hanjun.guo@linaro.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jason Low <jason.low2@hp.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux390@de.ibm.com
Cc: linux-ia64@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Link: http://lkml.kernel.org/r/1397209481-28542-2-git-send-email-vincent.guittot@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Implement the new smp_mb__* ops as per the old ones.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Link: http://lkml.kernel.org/n/tip-euuabnf5a3u23fy4fq8m3jcg@git.kernel.org
Cc: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Chen Gang <gang.chen@asianux.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull arch/tile updates from Chris Metcalf:
"These fix a few stray build issues seen in linux-next, and also add
the minimal required support for perf to tilegx"
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
arch/tile: remove unused variable 'devcap'
tile: Fix vDSO compilation issue with allyesconfig
perf tools: Allow building for tile
tile/perf: Support perf_events on tilegx and tilepro
tile: Enable NMIs on return from handle_nmi() without errors
tile: Add support for handling PMC hardware
tile: don't use __get_cpu_var() with structure-typed arguments
tile: avoid overflow in ns2cycles
The PMC module is used by perf_events, oprofile and watchdogs.
Signed-off-by: Zhigang Lu <zlu@tilera.com>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This patch allows each architecture to add its specific assembly optimized
arch_mcs_spin_lock_contended and arch_mcs_spinlock_uncontended for
MCS lock and unlock functions.
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: AswinChandramouleeswaran <aswin@hp.com>
Cc: George Spelvin <linux@horizon.com>
Cc: Rik vanRiel <riel@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: MichelLespinasse <walken@google.com>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Alex Shi <alex.shi@linaro.org>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Figo.zhang" <figo1802@gmail.com>
Cc: "Paul E.McKenney" <paulmck@linux.vnet.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Waiman Long <waiman.long@hp.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matthew R Wilcox <matthew.r.wilcox@intel.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1390347382.3138.67.camel@schen9-DESK
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We perform a clean up of the Kbuid files in each architecture.
We order the files in each Kbuild in alphabetical order
by running the below script.
for i in arch/*/include/asm/Kbuild
do
cat $i | gawk '/^generic-y/ {
i = 3;
do {
for (; i <= NF; i++) {
if ($i == "\\") {
getline;
i = 1;
continue;
}
if ($i != "")
hdr[$i] = $i;
}
break;
} while (1);
next;
}
// {
print $0;
}
END {
n = asort(hdr);
for (i = 1; i <= n; i++)
print "generic-y += " hdr[i];
}' > ${i}.sorted;
mv ${i}.sorted $i;
done
Signed-off-by: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Matthew R Wilcox <matthew.r.wilcox@intel.com>
Cc: AswinChandramouleeswaran <aswin@hp.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: "Paul E.McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Scott J Norton <scott.norton@hp.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: "Figo.zhang" <figo1802@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Waiman Long <waiman.long@hp.com>
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Alex Shi <alex.shi@linaro.org>
Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: George Spelvin <linux@horizon.com>
Cc: MichelLespinasse <walken@google.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Davidlohr Bueso <davidlohr.bueso@hp.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
[ Fixed build bug. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
With commit d8d14bd09c ("fs/compat: fix lookup_dcookie() parameter
handling") I changed the type of the len parameter of the
lookup_dcookie() syscall.
However I missed that there was still a stale declaration in
arch/tile/.. which now causes a compile error on tile:
In file included from fs/dcookies.c:28:0:
include/linux/compat.h:425:17: error: conflicting types for 'compat_sys_lookup_dcookie'
fs/dcookies.c:207:1: error: conflicting types for 'compat_sys_lookup_dcookie'
Simply remove the declaration in the tile architecture, which is only a
leftover from before the different compat lookup_dcookie() versions have
been merged. The correct declaration is now in include/linux/compat.h
The build error was reported by Fenguang's build bot.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull networking updates from David Miller:
1) BPF debugger and asm tool by Daniel Borkmann.
2) Speed up create/bind in AF_PACKET, also from Daniel Borkmann.
3) Correct reciprocal_divide and update users, from Hannes Frederic
Sowa and Daniel Borkmann.
4) Currently we only have a "set" operation for the hw timestamp socket
ioctl, add a "get" operation to match. From Ben Hutchings.
5) Add better trace events for debugging driver datapath problems, also
from Ben Hutchings.
6) Implement auto corking in TCP, from Eric Dumazet. Basically, if we
have a small send and a previous packet is already in the qdisc or
device queue, defer until TX completion or we get more data.
7) Allow userspace to manage ipv6 temporary addresses, from Jiri Pirko.
8) Add a qdisc bypass option for AF_PACKET sockets, from Daniel
Borkmann.
9) Share IP header compression code between Bluetooth and IEEE802154
layers, from Jukka Rissanen.
10) Fix ipv6 router reachability probing, from Jiri Benc.
11) Allow packets to be captured on macvtap devices, from Vlad Yasevich.
12) Support tunneling in GRO layer, from Jerry Chu.
13) Allow bonding to be configured fully using netlink, from Scott
Feldman.
14) Allow AF_PACKET users to obtain the VLAN TPID, just like they can
already get the TCI. From Atzm Watanabe.
15) New "Heavy Hitter" qdisc, from Terry Lam.
16) Significantly improve the IPSEC support in pktgen, from Fan Du.
17) Allow ipv4 tunnels to cache routes, just like sockets. From Tom
Herbert.
18) Add Proportional Integral Enhanced packet scheduler, from Vijay
Subramanian.
19) Allow openvswitch to mmap'd netlink, from Thomas Graf.
20) Key TCP metrics blobs also by source address, not just destination
address. From Christoph Paasch.
21) Support 10G in generic phylib. From Andy Fleming.
22) Try to short-circuit GRO flow compares using device provided RX
hash, if provided. From Tom Herbert.
The wireless and netfilter folks have been busy little bees too.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2064 commits)
net/cxgb4: Fix referencing freed adapter
ipv6: reallocate addrconf router for ipv6 address when lo device up
fib_frontend: fix possible NULL pointer dereference
rtnetlink: remove IFLA_BOND_SLAVE definition
rtnetlink: remove check for fill_slave_info in rtnl_have_link_slave_info
qlcnic: update version to 5.3.55
qlcnic: Enhance logic to calculate msix vectors.
qlcnic: Refactor interrupt coalescing code for all adapters.
qlcnic: Update poll controller code path
qlcnic: Interrupt code cleanup
qlcnic: Enhance Tx timeout debugging.
qlcnic: Use bool for rx_mac_learn.
bonding: fix u64 division
rtnetlink: add missing IFLA_BOND_AD_INFO_UNSPEC
sfc: Use the correct maximum TX DMA ring size for SFC9100
Add Shradha Shah as the sfc driver maintainer.
net/vxlan: Share RX skb de-marking and checksum checks with ovs
tulip: cleanup by using ARRAY_SIZE()
ip_tunnel: clear IPCB in ip_tunnel_xmit() in case dst_link_failure() is called
net/cxgb4: Don't retrieve stats during recovery
...
We're going to be adding a few new barrier primitives, and in order to
avoid endless duplication make more agressive use of
asm-generic/barrier.h.
Change the asm-generic/barrier.h such that it allows partial barrier
definitions and fills out the rest with defaults.
There are a few architectures (m32r, m68k) that could probably
do away with their barrier.h file entirely but are kept for now due to
their unconventional nop() implementation.
Suggested-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Michael Ellerman <michael@ellerman.id.au>
Cc: Michael Neuling <mikey@neuling.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Link: http://lkml.kernel.org/r/20131213150640.846368594@infradead.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
It turns out the kernel relies on barrier() to force a reload of the
percpu offset value. Since we can't easily modify the definition of
barrier() to include "tp" as an output register, we instead provide a
definition of __my_cpu_offset as extended assembly that includes a fake
stack read to hazard against barrier(), forcing gcc to know that it
must reread "tp" and recompute anything based on "tp" after a barrier.
This fixes observed hangs in the slub allocator when we are looping
on a percpu cmpxchg_double.
A similar fix for ARMv7 was made in June in change 509eb76ebf.
Cc: stable@vger.kernel.org
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
atomic* value is signed value, and atomic* functions need also process
signed value (parameter value, and return value), so use 'long long'
instead of 'u64'.
After replacement, it will also fix a bug for atomic64_add_negative():
"u64 is never less than 0".
The modifications are:
in vim, use "1,% s/\<u64\>/long long/g" command.
remove redundant '__aligned(8)'.
be sure of 80 (and macro '\') columns limitation after replacement.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> [re-instated const cast]
In order to prepare to per-arch implementations of preempt_count move
the required bits into an asm-generic header and use this for all
archs.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-h5j0c1r3e3fk015m30h8f1zx@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The hardware architecture descriptor headers have been updated, in
particular to reflect some larger MMIO fields on the mPIPE shims for
controlling the network hardware, from the recent Gx72 release.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
A config option to allow a variant vmap() using huge pages that was never
upstreamed had some bits of code related to it scattered around the tile
architecture; the config option was removed downstream and this commit
cleans up the scattered evidence of it from the upstream as well.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Pull Tile arch updates from Chris Metcalf:
"These changes bring in a bunch of new functionality that has been
maintained internally at Tilera over the last year, plus other stray
bits of work that I've taken into the tile tree from other folks.
The changes include some PCI root complex work, interrupt-driven
console support, support for performing fast-path unaligned data
fixups by kernel-based JIT code generation, CONFIG_PREEMPT support,
vDSO support for gettimeofday(), a serial driver for the tilegx
on-chip UART, KGDB support, more optimized string routines, support
for ftrace and kprobes, improved ASLR, and many bug fixes.
We also remove support for the old TILE64 chip, which is no longer
buildable"
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile: (85 commits)
tile: refresh tile defconfig files
tile: rework <asm/cmpxchg.h>
tile PCI RC: make default consistent DMA mask 32-bit
tile: add null check for kzalloc in tile/kernel/setup.c
tile: make __write_once a synonym for __read_mostly
tile: remove support for TILE64
tile: use asm-generic/bitops/builtin-*.h
tile: eliminate no-op "noatomichash" boot argument
tile: use standard tile_bundle_bits type in traps.c
tile: simplify code referencing hypervisor API addresses
tile: change <asm/system.h> to <asm/switch_to.h> in comments
tile: mark pcibios_init() as __init
tile: check for correct compiler earlier in asm-offsets.c
tile: use standard 'generic-y' model for <asm/hw_irq.h>
tile: use asm-generic version of <asm/local64.h>
tile PCI RC: add comment about "PCI hole" problem
tile: remove DEBUG_EXTRA_FLAGS kernel config option
tile: add virt_to_kpte() API and clean up and document behavior
tile: support FRAME_POINTER
tile: support reporting Tilera hypervisor statistics
...
The macrology in cmpxchg.h was designed to allow arbitrary pointer
and integer values to be passed through the routines. To support
cmpxchg() on 64-bit values on the 32-bit tilepro architecture, we
used the idiom "(typeof(val))(typeof(val-val))". This way, in the
"size 8" branch of the switch, when the underlying cmpxchg routine
returns a 64-bit quantity, we cast it first to a typeof(val-val)
quantity (i.e. size_t if "val" is a pointer) with no warnings about
casting between pointers and integers of different sizes, then cast
onwards to typeof(val), again with no warnings. If val is not a
pointer type, the additional cast is a no-op. We can't replace the
typeof(val-val) cast with (for example) unsigned long, since then if
"val" is really a 64-bit type, we cast away the high bits.
HOWEVER, this fails with current gcc (through 4.7 at least) if "val"
is a pointer to an incomplete type. Unfortunately gcc isn't smart
enough to realize that "val - val" will always be a size_t type
even if it's an incomplete type pointer.
Accordingly, I've reworked the way we handle the casting. We have
given up the ability to use cmpxchg() on 64-bit values on tilepro,
which is OK in the kernel since we should use cmpxchg64() explicitly
on such values anyway. As a result, I can just use simple "unsigned
long" casts internally.
As I reworked it, I realized it would be cleaner to move the
architecture-specific conditionals for cmpxchg and xchg out of the
atomic.h headers and into cmpxchg, and then use the cmpxchg() and
xchg() primitives directly in atomic.h and elsewhere. This allowed
the cmpxchg.h header to stand on its own without relying on the
implicit include of it that is performed by <asm/atomic.h>.
It also allowed collapsing the atomic_xchg/atomic_cmpxchg routines
from atomic_{32,64}.h into atomic.h.
I improved the tests that guard the allowed size of the arguments
to the routines to use a __compiletime_error() test. (By avoiding
the use of BUILD_BUG, I could include cmpxchg.h into bitops.h as
well and use the macros there, which is otherwise impossible due
to include order dependency issues.)
The tilepro _atomic_xxx internal methods were previously set up to
take atomic_t and atomic64_t arguments, which isn't as convenient
with the new model, so I modified them to take int or u64 arguments,
which is consistent with how they used the arguments internally
anyway, so provided some nice simplification there too.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Pull networking changes from David Miller:
"Noteworthy changes this time around:
1) Multicast rejoin support for team driver, from Jiri Pirko.
2) Centralize and simplify TCP RTT measurement handling in order to
reduce the impact of bad RTO seeding from SYN/ACKs. Also, when
both timestamps and local RTT measurements are available prefer
the later because there are broken middleware devices which
scramble the timestamp.
From Yuchung Cheng.
3) Add TCP_NOTSENT_LOWAT socket option to limit the amount of kernel
memory consumed to queue up unsend user data. From Eric Dumazet.
4) Add a "physical port ID" abstraction for network devices, from
Jiri Pirko.
5) Add a "suppress" operation to influence fib_rules lookups, from
Stefan Tomanek.
6) Add a networking development FAQ, from Paul Gortmaker.
7) Extend the information provided by tcp_probe and add ipv6 support,
from Daniel Borkmann.
8) Use RCU locking more extensively in openvswitch data paths, from
Pravin B Shelar.
9) Add SCTP support to openvswitch, from Joe Stringer.
10) Add EF10 chip support to SFC driver, from Ben Hutchings.
11) Add new SYNPROXY netfilter target, from Patrick McHardy.
12) Compute a rate approximation for sending in TCP sockets, and use
this to more intelligently coalesce TSO frames. Furthermore, add
a new packet scheduler which takes advantage of this estimate when
available. From Eric Dumazet.
13) Allow AF_PACKET fanouts with random selection, from Daniel
Borkmann.
14) Add ipv6 support to vxlan driver, from Cong Wang"
Resolved conflicts as per discussion.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1218 commits)
openvswitch: Fix alignment of struct sw_flow_key.
netfilter: Fix build errors with xt_socket.c
tcp: Add missing braces to do_tcp_setsockopt
caif: Add missing braces to multiline if in cfctrl_linkup_request
bnx2x: Add missing braces in bnx2x:bnx2x_link_initialize
vxlan: Fix kernel panic on device delete.
net: mvneta: implement ->ndo_do_ioctl() to support PHY ioctls
net: mvneta: properly disable HW PHY polling and ensure adjust_link() works
icplus: Use netif_running to determine device state
ethernet/arc/arc_emac: Fix huge delays in large file copies
tuntap: orphan frags before trying to set tx timestamp
tuntap: purge socket error queue on detach
qlcnic: use standard NAPI weights
ipv6:introduce function to find route for redirect
bnx2x: VF RSS support - VF side
bnx2x: VF RSS support - PF side
vxlan: Notify drivers for listening UDP port changes
net: usbnet: update addr_assign_type if appropriate
driver/net: enic: update enic maintainers and driver
driver/net: enic: Exposing symbols for Cisco's low latency driver
...
This change sets the PCI devices' initial DMA capabilities
conservatively and promotes them at the request of the driver,
as opposed to assuming advanced DMA capabilities. The old design
runs the risk of breaking drivers that assume default capabilities.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This was really only useful for TILE64 when we mapped the
kernel data with small pages. Now we use a huge page and we
really don't want to map different parts of the kernel
data in different ways.
We retain the __write_once name in case we want to bring
it back to life at some point in the future.
Note that this change uncovered a latent bug where the
"smp_topology" variable happened to always be aligned mod 8
so we could store two "int" values at once, but when we
eliminated __write_once it ended up only aligned mod 4.
Fix with an explicit annotation.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This chip is no longer being actively developed for (it was superceded
by the TILEPro64 in 2008), and in any case the existing compiler and
toolchain in the community do not support it. It's unlikely that the
kernel works with TILE64 at this point as the configuration has not been
tested in years. The support is also awkward as it requires maintaining
a significant number of ifdefs. So, just remove it altogether.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
The definisions of __ffs(), __fls(), and ffs() for tile are almost same
as asm-generic/bitops-*.h. The only difference is that it is defined
as __always_inline or inline. So this switches to use those headers.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> [moved #includes to end]
We use virt_to_pte(NULL, va) a lot, which isn't very obvious.
I added virt_to_kpte(va) as a more obvious wrapper function,
that also validates the va as being a kernel adddress.
And, I fixed the semantics of virt_to_pte() so that we handle
the pud and pmd the same way, and we now document the fact that
we handle the final pte level differently.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Newer hypervisors have an API for reporting per-cpu statistics
information. This change allows seeing that information via
/sys/devices/system/cpu/cpuN/hv_stats file for each core.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Enter kernel debugger at boot with:
--hvd UART_1=1 --hvx kgdbwait --hvx kgdboc=ttyS1,115200
or at runtime with:
echo ttyS1,115200 > /sys/module/kgdboc/parameters/kgdboc
echo g > /proc/sysrq-trigger
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
The TILE-Gx chip includes an on-chip UART. This change adds support
for using the UART from within the kernel. The UART shim has more
functionality than is exposed here, but to keep the kernel code and
binary simpler, this is a subset of the full API designed to enable
a standard Linux tty serial driver only.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
The existing code relied on the hardware definition (<arch/chip.h>)
to specify how much VA and PA space was available. It's convenient
to allow customizing this for some configurations, so provide symbols
MAX_PA_WIDTH and MAX_VA_WIDTH in <asm/page.h> that can be modified
if desired.
Additionally, move away from the MEM_XX_INTRPT nomenclature to
define the start of various regions within the VA space. In fact
the cleaner symbol is, for example, MEM_SV_START, to indicate the
start of the area used for supervisor code; the actual address of the
interrupt vectors is not as important, and can be changed if desired.
As part of this change, convert from "intrpt1" nomenclature (which
built in the old privilege-level 1 model) to a simple "intrpt".
Also strip out some tilepro-specific code supporting modifying the
PL the kernel could run at, since we don't actually support using
different PLs in tilepro, only tilegx.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Technically, user privilege is anything less than kernel
privilege. We modify the existing user_mode() macro to have
this semantic (and use it in a couple of places it wasn't being
used before), and add an IS_KERNEL_EX1() macro to the assembly
code as well.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This tile-specific API had a minor bug, in that if a super huge (>4GB)
page mapped a particular address range, we wouldn't handle it correctly.
As part of fixing that bug, I also cleaned up some of the pud and pmd
accessors to make them more consistent.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Previously, we used a special-purpose register (SPR_SYSTEM_SAVE_K_0)
to hold the CPU number and the top of the current kernel stack
by using the low bits to hold the CPU number, and using the high
bits to hold the address of the page just above where we'd want
the kernel stack to be. That way we could initialize a new SP
when first entering the kernel by just masking the SPR value and
subtracting a couple of words.
However, it's actually more useful to be able to place an arbitrary
kernel-top value in the SPR. This allows us to create a new stack
context (e.g. for virtualization) with an arbitrary top-of-stack VA.
To make this work, we now store the CPU number in the high bits,
above the highest legal VA bit (42 bits in the current tilegx
microarchitecture). The full 42 bits are thus available to store the
top of stack value. Getting the current cpu (a relatively common
operation) is still fast; it's now a shift rather than a mask.
We make this change only for tilegx, since tilepro has too few SPR
bits to do this, and we don't need this support on tilepro anyway.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Normally the build doesn't include these warnings, but at one
point I built with -Wsign-compare, and noticed a few things that
are technically bugs. This change fixes those things.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Nothing in the codebase was using them, and as written they took
"unsigned long" as the physical address rather than "phys_addr_t",
which is wrong on tilepro anyway. Rather than fixing stale APIs,
just remove them; if there's ever demand for them on this platform,
we can put them back.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
With this change, tile Linux now supports address-space layout
randomization for shared objects, stack, heap and vdso.
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Tony Lu <zlu@tilera.com>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This may fix a reported bug where an R_TILEGX_64 in a module was not
pointing to an aligned address.
Reported-by: Simon Marchi <simon.marchi@polymtl.ca>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This change includes support for Kprobes, Jprobes and Return Probes.
Reviewed-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Tony Lu <zlu@tilera.com>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This commit adds support for static ftrace, graph function support,
and dynamic tracer support.
Signed-off-by: Tony Lu <zlu@tilera.com>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This change creates the framework for vDSO calls, makes the existing
rt_sigreturn() mechanism use it, and adds a fast gettimeofday().
Now that we need to expose the vDSO address to userspace, we add
AT_SYSINFO_EHDR to the set of aux entries provided to userspace.
(You can disable any extra vDSO support by booting with vdso=0,
but the rt_sigreturn vDSO page will still be provided.)
Note that glibc has supported the tile vDSO since release 2.17.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
First, fix a bug in asm/unaligned.h; we need to just use the asm-generic
unaligned.h so we properly choose endian-correct flavors.
Second, keep the hv/hypervisor.h ABI fully "native" in the sense that
we don't have __BIG_ENDIAN__ ifdefs there. Instead, we use macros in
the head_NN.S assembly code to properly extract two 32-bit structure
members from a 64-bit register holding the structure.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This change adds support for CONFIG_PREEMPT (full kernel preemption).
In addition to the core support, this change includes a number
of places where we fix up uses of smp_processor_id() and per-cpu
variables. I also eliminate the PAGE_HOME_HERE and PAGE_HOME_UNKNOWN
values for page homing, as it turns out they weren't being used.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This change adds support for avoiding recursive backtracer crashes;
we haven't seen this in practice other than when things are seriously
corrupt, but it may help avoid losing the root cause of a crash.
Also, don't abort kernel backtracers for invalid userspace PC's.
If we do, we lose the ability to backtrace through a userspace
call to a bad address above PAGE_OFFSET, even though that it can
be perfectly reasonable to continue the backtrace in such a case.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This change enables unaligned userspace memory access via a kernel
fast path on tilegx. The kernel tracks user PC/instruction pairs
per-thread using a direct-mapped cache in userspace. The cache
maps those PC/instruction pairs to JIT'ed instruction sequences that
load or store using byte-wide load store intructions and then
synthesize 2-, 4- or 8-byte load or store results. Once an
instruction has been seen to generate an unaligned access once,
subsequent hits on that instruction typically require overhead
of only around 50 cycles if cache and TLB is hot.
We support the prctl() PR_GET_UNALIGN / PR_SET_UNALIGN sys call to
enable or disable unaligned fixups on a per-process basis.
To do this we pull some of the tilepro unaligned support out of the
single_step.c file; tilepro uses instruction disassembly for both
single-step and unaligned access support. Since tilegx actually has
hardware singlestep support, though, it's cleaner to keep the tilegx
unaligned access code in a separate file. While we're at it,
properly rename the tilepro-specific types, etc., to have tilepro
suffixes instead of generic tile suffixes.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
This change improves and cleans up the tile console.
- We enable HVC_IRQ support on tilegx, with the addition of a new
Tilera hypervisor API for tilegx to allow a console IPI. If IPI
support is not available we fall back to the previous polling mode.
- We simplify the earlyprintk code to use CON_BOOT and eliminate some
of the other supporting earlyprintk code.
- A new tile_console_write() primitive is used to send output to
the console and is factored out of the hvc_tile driver.
This lets us support a "sim_console" boot argument to allow using
simulator hooks to send output to the "console" as a slightly
faster alternative to emulating the hardware more directly.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
On Tilera Gx72 systems, the logic for figuring out whether
a given port is root complex is slightly different.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
The standard kernel function dma_get_required_mask() uses the
highest DRAM address to determine if 32-bit or 64-bit DMA addressing
is needed. This only works on architectures that have direct mapping
between the PA and the PCI address space, i.e. those that don't have
I/O TLBs or have I/O TLB but choose to use direct mapping. Neither
of these are true for tilegx. Whether to use 64-bit DMA should depend
on the PCI device's capability only, not on the amount of DRAM
installeds, so we now advertise a 64-bit DMA mask unconditionally.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
The .mem_resources[] field in the pci_controller struct
is now obsoleted by the .mem_space and .io_space fields.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
The TRIO shim initialization is shared with other kernel drivers
such as the endpoint and StreamIO drivers, so reorganize the
initialization flow to ensure that the root complex driver properly
initializes TRIO state regardless of what kind of TRIO driver will
end up using the shim.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
To enable this functionality, configure CONFIG_TILE_PCI_IO. Without
this flag, the kernel still assigns I/O address ranges to the
devices, but no TRIO resource and mapping support is provided.
We assign disjoint I/O address ranges to separate PCIe domains.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Besides using pr_info() to print the linkdown status for a plug-in
slot, add extra indication that this is expected if the slot is empty.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
To support PCIe devices with higher number of MSI-X interrupt vectors,
e.g. 16 for the LSI RAID card, enhance the Gx RC stack to provide more
MSI-X vectors by using the TRIO Scatter Queues, which provide 8 more
vectors in addition to ~10 from the Map Mem regions.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
The LSI MEGARAID SAS HBA suffers from the problem where it can do
64-bit DMA to streaming buffers but not to consistent buffers.
In other words, 64-bit DMA is used for disk data transfers and 32-bit
DMA must be used for control message transfers. According to LSI,
the firmware is not fully functional yet. This change implements a
kind of hybrid dma_ops to support this.
Note that on most other platforms, the 64-bit DMA addressing space is the
same as the 32-bit DMA space and they overlap the physical memory space.
No special arrangement is needed to support this kind of mixed DMA
capability. On TILE-Gx, the 64-bit DMA space is completely separate
from the 32-bit DMA space. Due to the use of the IOMMU, the 64-bit DMA
space doesn't overlap the physical memory space. On the other hand,
the 32-bit DMA space overlaps the physical memory space under 4GB.
The separate address spaces make it necessary to have separate dma_ops.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
- remove unneeded <linux/bootmem.h> include in pci.c
- eliminate unused pci_controller.first_busno field
- prefer msleep to mdelay
- remove stale comment about pci_scan_bus_parented()
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Using strlen as a model, add length checking to create strnlen.
Signed-off-by: Ken Steele <ken@tilera.com>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The initial driver support was for a single mPIPE shim on the chip
(as is the case for the Gx36 hardware). The Gx72 chip has two mPIPE
shims, so we extend the driver to handle that case.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The "inv" (invalidate) instruction is generally less safe than "finv"
(flush and invalidate), as it will drop dirty data from the cache.
It turns out we have almost no need for "inv" (other than for the
older 32-bit architecture in some limited cases), so convert to
"finv" where possible and delete the extra "inv" infrastructure.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Need add cmpxchg64(), or will cause compiling issue.
Need define it as cmpxchg() only for 64-bit operation, since cmpxchg()
can support 8 bytes.
The related error (with allmodconfig):
drivers/block/blockconsole.c: In function ‘bcon_advance_console_bytes’:
drivers/block/blockconsole.c:164:2: error: implicit declaration of function ‘cmpxchg64’ [-Werror=implicit-function-declaration]
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Normalize global variables exported by vmlinux.lds to conform usage
guidelines from include/asm-generic/sections.h.
1) Use _text to mark the start of the kernel image including the head
text, and _stext to mark the start of the .text section.
2) Export mandatory global variables __init_begin and __init_end.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull voluntary preemption fixes from Ingo Molnar:
"This tree contains a speedup which is achieved through better
might_sleep()/might_fault() preemption point annotations for uaccess
functions, by Michael S Tsirkin:
1. The only reason uaccess routines might sleep is if they fault.
Make this explicit for all architectures.
2. A voluntary preemption point in uaccess functions means compiler
can't inline them efficiently, this breaks assumptions that they
are very fast and small that e.g. net code seems to make. Remove
this preemption point so behaviour matches with what callers
assume.
3. Accesses (e.g through socket ops) to kernel memory with KERNEL_DS
like net/sunrpc does will never sleep. Remove an unconditinal
might_sleep() in the might_fault() inline in kernel.h (used when
PROVE_LOCKING is not set).
4. Accesses with pagefault_disable() return EFAULT but won't cause
caller to sleep. Check for that and thus avoid might_sleep() when
PROVE_LOCKING is set.
These changes offer a nice speedup for CONFIG_PREEMPT_VOLUNTARY=y
kernels, here's a network bandwidth measurement between a virtual
machine and the host:
before:
incoming: 7122.77 Mb/s
outgoing: 8480.37 Mb/s
after:
incoming: 8619.24 Mb/s [ +21.0% ]
outgoing: 9455.42 Mb/s [ +11.5% ]
I kept these changes in a separate tree, separate from scheduler
changes, because it's a mixed MM and scheduler topic"
* 'sched-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
mm, sched: Allow uaccess in atomic with pagefault_disable()
mm, sched: Drop voluntary schedule from might_fault()
x86: uaccess s/might_sleep/might_fault/
tile: uaccess s/might_sleep/might_fault/
powerpc: uaccess s/might_sleep/might_fault/
mn10300: uaccess s/might_sleep/might_fault/
microblaze: uaccess s/might_sleep/might_fault/
m32r: uaccess s/might_sleep/might_fault/
frv: uaccess s/might_sleep/might_fault/
arm64: uaccess s/might_sleep/might_fault/
asm-generic: uaccess s/might_sleep/might_fault/
Pull scheduler updates from Ingo Molnar:
"The main changes:
- load-calculation cleanups and improvements, by Alex Shi
- various nohz related tidying up of statisics, by Frederic
Weisbecker
- factor out /proc functions to kernel/sched/proc.c, by Paul
Gortmaker
- simplify the RT policy scheduler, by Kirill Tkhai
- various fixes and cleanups"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits)
sched/debug: Remove CONFIG_FAIR_GROUP_SCHED mask
sched/debug: Fix formatting of /proc/<PID>/sched
sched: Fix typo in struct sched_avg member description
sched/fair: Fix typo describing flags in enqueue_entity
sched/debug: Add load-tracking statistics to task
sched: Change get_rq_runnable_load() to static and inline
sched/tg: Remove tg.load_weight
sched/cfs_rq: Change atomic64_t removed_load to atomic_long_t
sched/tg: Use 'unsigned long' for load variable in task group
sched: Change cfs_rq load avg to unsigned long
sched: Consider runnable load average in move_tasks()
sched: Compute runnable load avg in cpu_load and cpu_avg_load_per_task
sched: Update cpu load after task_tick
sched: Fix sleep time double accounting in enqueue entity
sched: Set an initial value of runnable avg for new forked task
sched: Move a few runnable tg variables into CONFIG_SMP
Revert "sched: Introduce temporary FAIR_GROUP_SCHED dependency for load-tracking"
sched: Don't mix use of typedef ctl_table and struct ctl_table
sched: Remove WARN_ON(!sd) from init_sched_groups_power()
sched: Fix memory leakage in build_sched_groups()
...
Most of the stuff from kernel/sched.c was moved to kernel/sched/core.c long time
back and the comments/Documentation never got updated.
I figured it out when I was going through sched-domains.txt and so thought of
fixing it globally.
I haven't crossed check if the stuff that is referenced in sched/core.c by all
these files is still present and hasn't changed as that wasn't the motive behind
this patch.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/cdff76a265326ab8d71922a1db5be599f20aad45.1370329560.git.viresh.kumar@linaro.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The only reason uaccess routines might sleep
is if they fault. Make this explicit.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1369577426-26721-8-git-send-email-mst@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull tile update from Chris Metcalf:
"The interesting bug fix is support for the upcoming "4.2" release of
the Tilera hypervisor, which by default launches Linux at privilege
level 2 instead of 1. The fix lets new and old hypervisors and
Linuxes interoperate more smoothly, so I've tagged it for
stable@kernel.org so that older Linuxes will be able to boot under the
newer hypervisor."
* 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
usb: tilegx: fix memleak when create hcd fail
arch/tile: remove inline marking of EXPORT_SYMBOL functions
rtc: rtc-tile: add missing platform_device_unregister() when module exit
tile: support new Tilera hypervisor
The Tilera hypervisor shipped in releases up through MDE 4.1 launches
the client operating system (i.e. Linux) at privilege level 1 (PL1).
Starting with MDE 4.2, as part of the work to enable KVM, the
Tilera hypervisor launches Linux at PL2 instead.
This commit makes the KERNEL_PL option default to 2 for tilegx, while
still saying at 1 for tilepro, which doesn't have an updated hypervisor.
It also explains how and when you might want to choose another value.
In addition, we change a small buglet in the on-chip Ethernet driver,
where we were failing to use the KERNEL_PL constant in an API call.
To make the transition cleaner, this change also provides the updated
hv_init() API for the new hypervisor that supports announcing Linux's
compiled-in PL, so the hypervisor can generate a suitable error in the
case of a mismatched hypervisor and Linux binary.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: stable@vger.linux.org
Pull tile arch changes from Chris Metcalf:
"These are some minor new feature work and other changes that didn't
merit getting pushed up after the 3.9 merge window closed.
There should be a lot more activity in the 3.11 merge window"
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
arch/tile: Fix syscall return value passed to tracepoint
tile: comment assumption about __insn_mtspr for <asm/irqflags.h>
tile: ns2cycles should use __raw_get_cpu_var
arch: remove KCORE_ELF again [tile]
tile: remove two outdated Kconfig entries
tile: support atomic64_dec_if_positive()
tile: support TIF_SYSCALL_TRACEPOINT; select HAVE_SYSCALL_TRACEPOINTS
tile: Add definition of NR_syscalls
tile: move declaration of sys_call_table to <asm/syscall.h>
arch/tile: Enable HAVE_ARCH_TRACEHOOK
arch/tile: Call tracehook_report_syscall_{entry,exit} in syscall trace
The help text for this config is duplicated across the x86, parisc, and
s390 Kconfig.debug files. Arnd Bergman noted that the help text was
slightly misleading and should be fixed to state that enabling this
option isn't a problem when using pre 4.4 gcc.
To simplify the rewording, consolidate the text into lib/Kconfig.debug
and modify it there to be more explicit about when you should say N to
this config.
Also, make the text a bit more generic by stating that this option
enables compile time checks so we can cover architectures which emit
warnings vs. ones which emit errors. The details of how an
architecture decided to implement the checks isn't as important as the
concept of compile time checking of copy_from_user() calls.
While we're doing this, remove all the copy_from_user_overflow() code
that's duplicated many times and place it into lib/ so that any
architecture supporting this option can get the function for free.
Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Helge Deller <deller@gmx.de>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull SMP/hotplug changes from Ingo Molnar:
"This is a pretty large, multi-arch series unifying and generalizing
the various disjunct pieces of idle routines that architectures have
historically copied from each other and have grown in random, wildly
inconsistent and sometimes buggy directions:
101 files changed, 455 insertions(+), 1328 deletions(-)
this went through a number of review and test iterations before it was
committed, it was tested on various architectures, was exposed to
linux-next for quite some time - nevertheless it might cause problems
on architectures that don't read the mailing lists and don't regularly
test linux-next.
This cat herding excercise was motivated by the -rt kernel, and was
brought to you by Thomas "the Whip" Gleixner."
* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits)
idle: Remove GENERIC_IDLE_LOOP config switch
um: Use generic idle loop
ia64: Make sure interrupts enabled when we "safe_halt()"
sparc: Use generic idle loop
idle: Remove unused ARCH_HAS_DEFAULT_IDLE
bfin: Fix typo in arch_cpu_idle()
xtensa: Use generic idle loop
x86: Use generic idle loop
unicore: Use generic idle loop
tile: Use generic idle loop
tile: Enter idle with preemption disabled
sh: Use generic idle loop
score: Use generic idle loop
s390: Use generic idle loop
powerpc: Use generic idle loop
parisc: Use generic idle loop
openrisc: Use generic idle loop
mn10300: Use generic idle loop
mips: Use generic idle loop
microblaze: Use generic idle loop
...
Commit abf09bed3c ("s390/mm: implement software dirty bits")
introduced another difference in the pte layout vs. the pmd layout on
s390, thoroughly breaking the s390 support for hugetlbfs. This requires
replacing some more pte_xxx functions in mm/hugetlbfs.c with a
huge_pte_xxx version.
This patch introduces those huge_pte_xxx functions and their generic
implementation in asm-generic/hugetlb.h, which will now be included on
all architectures supporting hugetlbfs apart from s390. This change
will be a no-op for those architectures.
[akpm@linux-foundation.org: fix warning]
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Hugh Dickins <hughd@google.com>
Cc: Hillf Danton <dhillf@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.cz> [for !s390 parts]
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The arch_local_irq_save(), etc., routines are required to function
as compiler barriers. They do, but it's subtle and requires knowing
that the gcc builtin __insn_mtspr() is marked as a memory clobber.
Provide a comment explaining the assumption.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
[ This came about from me wondering about the synchronization rules of
__insn_mtspr() - Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The arch_local_irq_save(), etc., routines are required to function
as compiler barriers. They do, but it's subtle and requires knowing
that the gcc builtin __insn_mtspr() is marked as a memory clobber.
Provide a comment explaining the assumption.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Move it to a common place. Preparatory patch for implementing
set/clear for the idle need_resched poll implementation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Cc: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Cc: Magnus Damm <magnus.damm@gmail.com>
Link: http://lkml.kernel.org/r/20130321215233.446034505@linutronix.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
This patch adds support for the TIF_SYSCALL_TRACEPOINT on the tile
architecture. Basically, it calls the appropriate tracepoints on syscall
entry and exit.
Signed-off-by: Simon Marchi <simon.marchi@polymtl.ca>
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>