It is still necessary to handle icache coherency in flush_cache_range()
and copy_to_user_page() when the icache fills from the dcache, even
though the dcache does not need to be written back. However when this
handling was added in commit 2eaa7ec286 ("[MIPS] Handle I-cache
coherency in flush_cache_range()"), it did not do any icache flushing
when it fills from dcache.
Therefore fix r4k_flush_cache_range() to run
local_r4k_flush_cache_range() without taking into account whether icache
fills from dcache, so that the icache coherency gets handled. Checks are
also added in local_r4k_flush_cache_range() so that the dcache blast
doesn't take place when icache fills from dcache.
A test to mmap a page PROT_READ|PROT_WRITE, modify code in it, and
mprotect it to VM_READ|VM_EXEC (similar to case described in above
commit) can hit this case quite easily to verify the fix.
A similar check was added in commit f8829caee3 ("[MIPS] Fix aliasing
bug in copy_to_user_page / copy_from_user_page"), so also fix
copy_to_user_page() similarly, to call flush_cache_page() without taking
into account whether icache fills from dcache, since flush_cache_page()
already takes that into account to avoid performing a dcache flush.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Cc: Manuel Lauss <manuel.lauss@gmail.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/12179/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Let's define page_mapped() to be true for compound pages if any
sub-pages of the compound page is mapped (with PMD or PTE).
On other hand page_mapcount() return mapcount for this particular small
page.
This will make cases like page_get_anon_vma() behave correctly once we
allow huge pages to be mapped with PTE.
Most users outside core-mm should use page_mapcount() instead of
page_mapped().
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Tested-by: Sasha Levin <sasha.levin@oracle.com>
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Acked-by: Jerome Marchand <jmarchan@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Steve Capper <steve.capper@linaro.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
MAARs should be initialised on each CPU (or rather, core) in the system
in order to achieve consistent behaviour & performance. Previously they
have only been initialised on the boot CPU which leads to performance
problems if tasks are later scheduled on a secondary CPU, particularly
if those tasks make use of unaligned vector accesses where some CPUs
don't handle any cases in hardware for non-speculative memory regions.
Fix this by recording the MAAR configuration from the boot CPU and
applying it to secondary CPUs as part of their bringup.
Reported-by: Doug Gilmore <doug.gilmore@imgtec.com>
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: Andrew Bresticker <abrestic@chromium.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: linux-kernel@vger.kernel.org
Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: James Hogan <james.hogan@imgtec.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: Hemmo Nieminen <hemmo.nieminen@iki.fi>
Cc: Alex Smith <alex.smith@imgtec.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Patchwork: https://patchwork.linux-mips.org/patch/11239/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Verifying that the MAAR configuration is as expected is useful when
debugging the performance of a system. Print out the memory regions
configured via MAAR along with their attributes.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Patchwork: https://patchwork.linux-mips.org/patch/11238/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
maar_init was previously only compiled when CONFIG_NEED_MULTIPLE_NODES
was not set, which has been fine since it is only called from the
standard implementation of mem_init which has the same condition. In
preparation for calling it from the SMP startup code on secondary CPUs,
move maar_init outside of the #ifndef such that it is always compiled.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: linux-kernel@vger.kernel.org
Cc: Ingo Molnar <mingo@kernel.org>
Patchwork: https://patchwork.linux-mips.org/patch/11237/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Introduce a default weak implementation of platform_maar_init which
makes use of the data that platforms already provide to the bootmem
allocator. This should hopefully cover the most common configurations,
reduce the duplication of information provided by platforms & leaves
platforms with the option of providing a custom implementation if
required.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: linux-kernel@vger.kernel.org
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Patchwork: https://patchwork.linux-mips.org/patch/10676/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Add support for extended physical addressing (XPA) so that
32-bit platforms can access equal to or greater than 40 bits
of physical addresses.
NOTE:
1) XPA and EVA are not the same and cannot be used
simultaneously.
2) If you configure your kernel for XPA, the PTEs
and all address sizes become 64-bit.
3) Your platform MUST have working HIGHMEM support.
Signed-off-by: Steven J. Hill <Steven.Hill@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9355/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
In order to make the static inline function is_zero_pfn() callable by
modules, export its symbol dependencies 'zero_pfn' and (for s390 and
mips) 'zero_page_mask'.
We need this for KVM, as CONFIG_KVM is a tristate for all supported
architectures except ARM and arm64, and testing a pfn whether it refers
to the zero page is required to correctly distinguish the zero page
from other special RAM ranges that may also have the PG_reserved bit
set, but need to be treated as MMIO memory.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add initialisation for Memory Accessibility Attribute Registers. Generic
code cannot know the platform-specific requirements with regards to
speculative accesses, so it simply calls a platform_maar_init function
which platforms with MAARs are expected to implement by calling the
provided write_maar_pair function & returning the number of MAAR pairs
used. A weak default implementation will simply use no MAAR pairs. Any
present but unused MAAR pairs are then marked invalid, effectively
disabling them.
The end result of this patch is that MAARs are all marked invalid, until
platforms implement the platform_maar_init function.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/7331/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This is identical to kmap_coherent apart from the cache coherency
attribute used for the TLB entry, so kmap_coherent is abstracted to
kmap_prot which is then called for both kmap_coherent &
kmap_noncoherent. This will be used by a subsequent patch.
Suggested-by: Leonid Yegoshin <leonid.yegoshin@imgtec.com>
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Nobody is maintaining SMTC anymore and there also seems to be no userbase.
Which is a pity - the SMTC technology primarily developed by Kevin D.
Kissell <kevink@paralogos.com> is an ingenious demonstration for the MT
ASE's power and elegance.
Based on Markos Chandras <Markos.Chandras@imgtec.com> patch
https://patchwork.linux-mips.org/patch/6719/ which while very similar did
no longer apply cleanly when I tried to merge it plus some additional
post-SMTC cleanup - SMTC was a feature as tricky to remove as it was to
merge once upon a time.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
A core in EVA mode can have any possible segment mapping, so the
default free_initmem_default() function may not always work as expected.
Therefore, add a callback that platforms can use to free up the init section.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
The UNIQUE_ENTRYHI definition was duplicated whenever there
was the need to flush the TLB entries. We move this common
definition to a header file.
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Signed-off-by: John Crispin <blogic@openwrt.org>
Patchwork: http://patchwork.linux-mips.org/patch/6129/
Rewrite the preempt_count macros in order to extract the 3 basic
preempt_count value modifiers:
__preempt_count_add()
__preempt_count_sub()
and the new:
__preempt_count_dec_and_test()
And since we're at it anyway, replace the unconventional
$op_preempt_count names with the more conventional preempt_count_$op.
Since these basic operators are equivalent to the previous _notrace()
variants, do away with the _notrace() versions.
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/n/tip-ewbpdbupy9xpsjhg960zwbv8@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Concentrate code to modify totalram_pages into the mm core, so the arch
memory initialized code doesn't need to take care of it. With these
changes applied, only following functions from mm core modify global
variable totalram_pages: free_bootmem_late(), free_all_bootmem(),
free_all_bootmem_node(), adjust_managed_page_count().
With this patch applied, it will be much more easier for us to keep
totalram_pages and zone->managed_pages in consistence.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Acked-by: David Howells <dhowells@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <sworddragon2@aol.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Change signature of free_reserved_area() according to Russell King's
suggestion to fix following build warnings:
arch/arm/mm/init.c: In function 'mem_init':
arch/arm/mm/init.c:603:2: warning: passing argument 1 of 'free_reserved_area' makes integer from pointer without a cast [enabled by default]
free_reserved_area(__va(PHYS_PFN_OFFSET), swapper_pg_dir, 0, NULL);
^
In file included from include/linux/mman.h:4:0,
from arch/arm/mm/init.c:15:
include/linux/mm.h:1301:22: note: expected 'long unsigned int' but argument is of type 'void *'
extern unsigned long free_reserved_area(unsigned long start, unsigned long end,
mm/page_alloc.c: In function 'free_reserved_area':
>> mm/page_alloc.c:5134:3: warning: passing argument 1 of 'virt_to_phys' makes pointer from integer without a cast [enabled by default]
In file included from arch/mips/include/asm/page.h:49:0,
from include/linux/mmzone.h:20,
from include/linux/gfp.h:4,
from include/linux/mm.h:8,
from mm/page_alloc.c:18:
arch/mips/include/asm/io.h:119:29: note: expected 'const volatile void *' but argument is of type 'long unsigned int'
mm/page_alloc.c: In function 'free_area_init_nodes':
mm/page_alloc.c:5030:34: warning: array subscript is below array bounds [-Warray-bounds]
Also address some minor code review comments.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: <sworddragon2@aol.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: Michel Lespinasse <walken@google.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull VFS updates from Al Viro,
Misc cleanups all over the place, mainly wrt /proc interfaces (switch
create_proc_entry to proc_create(), get rid of the deprecated
create_proc_read_entry() in favor of using proc_create_data() and
seq_file etc).
7kloc removed.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits)
don't bother with deferred freeing of fdtables
proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h
proc: Make the PROC_I() and PDE() macros internal to procfs
proc: Supply a function to remove a proc entry by PDE
take cgroup_open() and cpuset_open() to fs/proc/base.c
ppc: Clean up scanlog
ppc: Clean up rtas_flash driver somewhat
hostap: proc: Use remove_proc_subtree()
drm: proc: Use remove_proc_subtree()
drm: proc: Use minor->index to label things, not PDE->name
drm: Constify drm_proc_list[]
zoran: Don't print proc_dir_entry data in debug
reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show()
proc: Supply an accessor for getting the data from a PDE's parent
airo: Use remove_proc_subtree()
rtl8192u: Don't need to save device proc dir PDE
rtl8187se: Use a dir under /proc/net/r8180/
proc: Add proc_mkdir_data()
proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h}
proc: Move PDE_NET() to fs/proc/proc_net.c
...
Use helper function free_highmem_page() to free highmem pages into
the buddy system.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Daney <david.daney@cavium.com>
Cc: Cong Wang <amwang@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use common help functions to free reserved pages.
Signed-off-by: Jiang Liu <jiang.liu@huawei.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Having received another series of whitespace patches I decided to do this
once and for all rather than dealing with this kind of patches trickling
in forever.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
We can save an instruction in the TLB Refill path for kernel mappings
by aligning swapper_pg_dir on a 64K boundary. The address of
swapper_pg_dir can be generated with a single LUI instead of
LUI/{D}ADDUI.
The alignment of __init_end is bumped up to 64K so there are no holes
between it and swapper_pg_dir, which is placed at the very beginning
of .bss.
The alignment of invalid_pmd_table and invalid_pte_table can be
relaxed to PAGE_SIZE. We do this by using __page_aligned_bss, which
has the added benefit of eliminating alignment holes in .bss.
Signed-off-by: David Daney <david.daney@cavium.com>
Cc: linux-mips@linux-mips.org
Cc: linux-arch@vger.kernel.org,
Cc: linux-kernel@vger.kernel.org
Acked-by: Arnd Bergmann <arnd@arndb.de>
Patchwork: https://patchwork.linux-mips.org/patch/4220/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
This patch addresses a couple of related problems:
1) The kernel may reside in physical memory outside of the ranges set
by plat_mem_setup(). If this is the case, init mem cannot be
reused as it resides outside of the range of pages that the kernel
memory allocators control.
2) initrd images might be loaded in physical memory outside of the
ranges set by plat_mem_setup(). The memory likewise cannot be
reused. The patch doesn't handle this specific case, but the
infrastructure is useful for future patches that do.
The crux of the problem is that there are memory regions that need be
memory_present(), but that cannot be free_bootmem() at the time of
arch_mem_init(). We create a new type of memory (BOOT_MEM_INIT_RAM)
for use with add_memory_region(). Then arch_mem_init() adds the init
mem with this type if the init mem is not already covered by existing
ranges.
When memory is being freed into the bootmem allocator, we skip the
BOOT_MEM_INIT_RAM ranges so they are not clobbered, but we do signal
them as memory_present(). This way when they are later freed, the
necessary memory manager structures have initialized and the Sparse
allocater is prevented from crashing.
The Octeon specific code that handled this case is removed, because
the new general purpose code handles the case.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/1988/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
fixrange_init() allocates page tables for all addresses higher than
FIXADDR_TOP. On processors that override the default FIXADDR_TOP
address of 0xfffe_0000, this can consume up to 4 pages (1 page per 4MB)
for pgd's that are never used.
Signed-off-by: Kevin Cernekee <cernekee@gmail.com>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/1980/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
pfn_valid() compares the PFN to max_mapnr:
__pfn >= min_low_pfn && __pfn < max_mapnr;
On HIGHMEM kernels, highend_pfn is used to set the value of max_mapnr.
Unfortunately, highend_pfn is left at zero if the system does not
actually have enough RAM to reach into the HIGHMEM range. This causes
pfn_valid() to always return false, and when debug checks are enabled
the kernel will fail catastrophically:
Memory: 22432k/32768k available (2249k kernel code, 10336k reserved, 653k data, 1352k init, 0k highmem)
NR_IRQS:128
kfree_debugcheck: out of range ptr 81c02900h.
Kernel bug detected[#1]:
Cpu 0
$ 0 : 00000000 10008400 00000034 00000000
$ 4 : 8003e160 802a0000 8003e160 00000000
$ 8 : 00000000 0000003e 00000747 00000747
...
On such a configuration, max_low_pfn should be used to set max_mapnr.
This was seen on 2.6.34.
Signed-off-by: Kevin Cernekee <cernekee@gmail.com>
To: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
Cc: linux-kernel@vger.kernel.org
Patchwork: https://patchwork.linux-mips.org/patch/1992/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Fold all the mmu_gather rework patches into one for submission
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Reported-by: Hugh Dickins <hughd@google.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Miller <davem@davemloft.net>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Tony Luck <tony.luck@intel.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Mel Gorman <mel@csn.ul.ie>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Under some combinations of CONFIG_*, lastpfn in page_is_ram is 'set
but not used'. Mark it as __maybe_unused to quiet the warning/error.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/2033/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
* 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, mm: Unify kernel_physical_mapping_init() API
x86, mm: Allow highmem user page tables to be disabled at boot time
x86: Do not reserve brk for DMI if it's not going to be used
x86: Convert tlbstate_lock to raw_spinlock
x86: Use the generic page_is_ram()
x86: Remove BIOS data range from e820
Move page_is_ram() declaration to mm.h
Generic page_is_ram: use __weak
resources: introduce generic page_is_ram()
The SmartMIPS ASE specifies how Read Inhibit (RI) and eXecute Inhibit
(XI) bits in the page tables work. The upper two bits of EntryLo{0,1}
are RI and XI when the feature is enabled in the PageGrain register.
SmartMIPS only covers 32-bit systems. Cavium Octeon+ extends this to
64-bit systems by continuing to place the RI and XI bits in the top of
EntryLo even when EntryLo is 64-bits wide.
Because we need to carry the RI and XI bits in the PTE, the layout of
the PTE is changed. There is a two instruction overhead in the TLB
refill hot path to get the EntryLo bits into the proper position.
Also the TLB load exception has to probe the TLB to check if RI or XI
caused the exception.
Also of note is that the layout of the PTE bits is done at compile and
runtime rather than statically. In the 32-bit case this allows for
the same number of PFN bits as before the patch as the _PAGE_HUGE is
not supported in 32-bit kernels (we have _PAGE_NO_EXEC and
_PAGE_NO_READ instead of _PAGE_READ and _PAGE_HUGE).
The patch is tested on Cavium Octeon+, but should also work on 32-bit
systems with the Smart-MIPS ASE.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
To: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/952/
Patchwork: http://patchwork.linux-mips.org/patch/956/
Patchwork: http://patchwork.linux-mips.org/patch/962/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
For 64-bit kernels with 64KB pages and two level page tables, there are
42 bits worth of virtual address space This is larger than the 40 bits of
virtual address space obtained with the default 4KB Page size and three
levels, so there are no draw backs for using two level tables with this
configuration.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Cc: linux-mips@linux-mips.org
Patchwork: http://patchwork.linux-mips.org/patch/761/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
x86/mm is on 32-rc4 and missing the spinlock namespace changes which
are needed for further commits into this topic.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
It's based on walk_system_ram_range(), for archs that don't have
their own page_is_ram().
The static verions in MIPS and SCORE are also made global.
v4: prefer plain 1 instead of PAGE_IS_RAM (H. Peter Anvin)
v3: add comment (KAMEZAWA Hiroyuki)
"AFAIK, this "System RAM" information has been used for kdump to
grab valid memory area and seems good for the kernel itself."
v2: add PAGE_IS_RAM macro (Américo Wang)
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Américo Wang <xiyou.wangcong@gmail.com>
Cc: linux-mips@linux-mips.org
Cc: Yinghai Lu <yinghai@kernel.org>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Wu Fengguang <fengguang.wu@intel.com>
LKML-Reference: <20100122081619.GA6431@localhost>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
Makes it consistent with the extern declaration, used when CONFIG_HIGHMEM
is set Removes redundant casts in printout messages
Signed-off-by: Andreas Fenkart <andreas.fenkart@streamunlimited.com>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Processors that support the mips64r2 ISA can in four instructions
convert a shifted PGD pointer stored in the upper bits of c0_context
into a usable pointer. By doing this we save a memory load and
associated potential cache miss in the TLB exception handlers.
Since the upper bits of c0_context were holding the CPU number, we
move this to the upper bits of c0_xcontext which doesn't have enough
bits to hold the PGD pointer, but has plenty for the CPU number.
Signed-off-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
On an SMP system with cache aliases, the following sequence of events may
happen:
1) copy_user_highpage() runs on CPU0, invoking kmap_coherent() to create a
temporary mapping in the fixmap region
2) copy_page() starts on CPU0
3) CPU1 sends CPU0 an IPI asking CPU0 to run local_r4k_flush_cache_page()
4) CPU0 takes the interrupt, interrupting copy_page()
5) local_r4k_flush_cache_page() on CPU0 calls kmap_coherent() again
6) The second invocation of kmap_coherent() on CPU0 tries to use the
same fixmap virtual address that was being used by copy_user_highpage()
7) CPU0 throws a machine check exception for the TLB address conflict
Fixed by creating an extra set of fixmap entries for use in interrupt
handlers. This prevents fixmap VA conflicts between copy_user_highpage()
running in user context, and local_r4k_flush_cache_page() invoked from an
SMP IPI.
Signed-off-by: Kevin Cernekee <cernekee@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
For /proc/kcore, each arch registers its memory range by kclist_add().
In usual,
- range of physical memory
- range of vmalloc area
- text, etc...
are registered but "range of physical memory" has some troubles. It
doesn't updated at memory hotplug and it tend to include unnecessary
memory holes. Now, /proc/iomem (kernel/resource.c) includes required
physical memory range information and it's properly updated at memory
hotplug. Then, it's good to avoid using its own code(duplicating
information) and to rebuild kclist for physical memory based on
/proc/iomem.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: WANG Cong <xiyou.wangcong@gmail.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For /proc/kcore, vmalloc areas are registered per arch. But, all of them
registers same range of [VMALLOC_START...VMALLOC_END) This patch unifies
them. By this. archs which have no kclist_add() hooks can see vmalloc
area correctly.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Presently, kclist_add() only eats start address and size as its arguments.
Considering to make kclist dynamically reconfigulable, it's necessary to
know which kclists are for System RAM and which are not.
This patch add kclist types as
KCORE_RAM
KCORE_VMALLOC
KCORE_TEXT
KCORE_OTHER
This "type" is used in a patch following this for detecting KCORE_RAM.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: WANG Cong <xiyou.wangcong@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 9617729941 ("Drop free_pages()")
modified nr_free_pages() to return 'unsigned long' instead of 'unsigned
int'. This made the casts to 'unsigned long' in most callers superfluous,
so remove them.
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com>
Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: David S. Miller <davem@davemloft.net>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: WANG Cong <xiyou.wangcong@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: David Howells <dhowells@redhat.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Chris Zankel <zankel@tensilica.com>
Cc: Michal Simek <monstr@monstr.eu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
By combining swapper_pg_dir and module_pg_dir, several if conditions
can be eliminated from the tlb exception handler. The reason they
can be combined is that, the effective virtual address of vmalloc
returned is at the bottom, and of module_alloc returned is at the
top. It also fixes the bug in vmalloc(), which happens when its
return address is not covered by the first pgd.
Signed-off-by: Wu Fei <at.wufei@gmail.com>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Some of the were relying into smp.h being dragged in by another header
which of course is fragile. <asm/cpu-info.h> uses smp_processor_id()
only in macros and including smp.h there leads to an include loop, so
don't change cpu-info.h.
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
Commit 351336929c (kernel.org) rsp.
b3594a089f1c17ff919f8f78505c3f20e1f6f8ce (linux-mips.org):
> From: Chris Dearman <chris@mips.com>
> Date: Wed, 19 Sep 2007 00:58:24 +0100
> Subject: [PATCH] [MIPS] Allow setting of the cache attribute at run time.
>
> Slightly tacky, but there is a precedent in the sparc archirecture code.
introduces the variable _page_cachable_default, which defaults to zero and.
is used to create the prototype PTE for __kmap_atomic in
arch/mips/mm/init.c:kmap_init before initialization in
arch/mips/mm/c-r4k.c:coherency_setup, so the default value of 0 will be
used as the CCA of kmap atomic pages which on many processors is not a
defined CCA value and may result in writes to kmap_atomic pages getting
corrupted. Debugged by Jon Fraser (jfraser@broadcom.com).
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>