linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-15 08:46:45 +07:00

Author	SHA1	Message	Date
DaeSeok Youn	ff252c1fc5	kernel/fork.c: make dup_mm() static dup_mm() is used only in kernel/fork.c Signed-off-by: Daeseok Youn <daeseok.youn@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:37:02 -08:00
Oleg Nesterov	74e37200de	proc: cleanup/simplify get_task_state/task_state_array get_task_state() and task_state_array[] look confusing and suboptimal, it is not clear what it can actually report to user-space and task_state_array[] blows .data for no reason. 1. state = (tsk->state & TASK_REPORT) \| tsk->exit_state is not clear. TASK_REPORT is self-documenting but it is not clear what ->exit_state can add. Move the potential exit_state's (EXIT_ZOMBIE and EXIT_DEAD) into TASK_REPORT and use it to calculate the final result. 2. With the change above it is obvious that task_state_array[] has the unused entries just to make BUILD_BUG_ON() happy. Change this BUILD_BUG_ON() to use TASK_REPORT rather than TASK_STATE_MAX and shrink task_state_array[]. 3. Turn the "while (state)" loop into fls(state). Signed-off-by: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: David Laight <David.Laight@ACULAB.COM> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:37:01 -08:00
Oleg Nesterov	942be3875a	coredump: make __get_dumpable/get_dumpable inline, kill fs/coredump.h 1. Remove fs/coredump.h. It is not clear why do we need it, it only declares __get_dumpable(), signal.c includes it for no reason. 2. Now that get_dumpable() and __get_dumpable() are really trivial make them inline in linux/sched.h. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Alex Kelly <alex.page.kelly@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Petr Matousek <pmatouse@redhat.com> Cc: Vasily Kulikov <segoon@openwall.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:37:01 -08:00
Oleg Nesterov	7288e1187b	coredump: kill MMF_DUMPABLE and MMF_DUMP_SECURELY Nobody actually needs MMF_DUMPABLE/MMF_DUMP_SECURELY, they are only used to enforce the encoding of SUID_DUMP_* enum in mm->flags & MMF_DUMPABLE_MASK. Now that set_dumpable() updates both bits atomically we can kill them and simply store the value "as is" in 2 lower bits. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Alex Kelly <alex.page.kelly@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Petr Matousek <pmatouse@redhat.com> Cc: Vasily Kulikov <segoon@openwall.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:37:01 -08:00
Axel Lin	0fa9aa20c3	fs/ramfs/file-nommu.c: make ramfs_nommu_get_unmapped_area() and ramfs_nommu_mmap() static Since commit `853ac43ab1` ("shmem: unify regular and tiny shmem"), ramfs_nommu_get_unmapped_area() and ramfs_nommu_mmap() are not directly referenced outside of file-nommu.c. Thus make them static. Signed-off-by: Axel Lin <axel.lin@ingics.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:58 -08:00
Joe Perches	c28aa1f0a8	printk/cache: mark printk_once test variable __read_mostly Add #include <linux/cache.h> to define __read_mostly. Convert cache.h to use uapi/linux/kernel.h instead of linux/kernel.h to avoid recursive #includes. Convert the ALIGN macro to __ALIGN_KERNEL. printk_once only sets the bool variable tested once so mark it __read_mostly. Neaten the alignment so it matches the rest of the pr_<level>_once #defines too. Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: James Hogan <james.hogan@imgtec.com> Cc: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:56 -08:00
Du, Changbin	aace05097a	lib/parser.c: add match_wildcard() function match_wildcard function is a simple implementation of wildcard matching algorithm. It only supports two usual wildcardes: '*' - matches zero or more characters '?' - matches one character This algorithm is safe since it is non-recursive. Signed-off-by: Du, Changbin <changbin.du@gmail.com> Cc: Jason Baron <jbaron@akamai.com> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:55 -08:00
Mike Frysinger	0d9dfc23f4	uapi: convert u64 to __u64 in exported headers The u64 type is not defined in any exported kernel headers, so trying to use it will lead to build failures. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Acked-by: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:55 -08:00
Mike Frysinger	c318924582	include/uapi/linux/dn.h: pull in ioctl.h header This header uses _IOW/_IOR defines but doesn't include ioctl.h for it. If you try to use this w/out including ioctl.h yourself, it can fail to build, so add the explicit include. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:55 -08:00
Mike Frysinger	e8b6714604	include/uapi/linux/ppp-ioctl.h: pull in ppp_defs.h This header uses enum NPmode but doesn't include ppp_defs.h. If you try to use this header w/out including the defs header first, it leads to a build failure. So add the explicit include to fix it. Don't know of any packages directly impacted, but noticed while building some ppp code by hand. Signed-off-by: Mike Frysinger <vapier@gentoo.org> Cc: Paul Mackerras <paulus@samba.org> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:55 -08:00
David Howells	00b2c76a6a	include/linux/of.h: make for_each_child_of_node() reference its args when CONFIG_OF=n Make for_each_child_of_node() reference its args when CONFIG_OF=n to avoid warnings like: drivers/leds/leds-pwm.c:88:22: warning: unused variable 'node' [-Wunused-variable] struct device_node *node = pdev->dev.of_node; ^ Signed-off-by: David Howells <dhowells@redhat.com> Cc: Grant Likely <grant.likely@linaro.org> Cc: Rob Herring <robh+dt@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:55 -08:00
Alex Elder	04f9b74e4d	remove extra definitions of U32_MAX Now that the definition is centralized in <linux/kernel.h>, the definitions of U32_MAX (and related) elsewhere in the kernel can be removed. Signed-off-by: Alex Elder <elder@linaro.org> Acked-by: Sage Weil <sage@inktank.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:55 -08:00
Alex Elder	89a0714106	kernel.h: define u8, s8, u32, etc. limits Create constants that define the maximum and minimum values representable by the kernel types u8, s8, u16, s16, and so on. Signed-off-by: Alex Elder <elder@linaro.org> Cc: Sage Weil <sage@inktank.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:54 -08:00
Alex Elder	77719536dc	conditionally define U32_MAX The symbol U32_MAX is defined in several spots. Change these definitions to be conditional. This is in preparation for the next patch, which centralizes the definition in <linux/kernel.h>. Signed-off-by: Alex Elder <elder@linaro.org> Cc: Sage Weil <sage@inktank.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:54 -08:00
Mark Salter	d57c33c5da	add generic fixmap.h Many architectures provide an asm/fixmap.h which defines support for compile-time 'special' virtual mappings which need to be made before paging_init() has run. This support is also used for early ioremap on x86. Much of this support is identical across the architectures. This patch consolidates all of the common bits into asm-generic/fixmap.h which is intended to be included from arch/*/include/asm/fixmap.h. Signed-off-by: Mark Salter <msalter@redhat.com> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Ralf Baechle <ralf@linux-mips.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: James Hogan <james.hogan@imgtec.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Richard Weinberger <richard@nod.at> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Jonas Bonn <jonas.bonn@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:54 -08:00
Geert Uytterhoeven	0c79a8e29b	asm/types.h: Remove include/asm-generic/int-l64.h Now all 64-bit architectures have been converted to int-ll64.h, we can remove int-l64.h in kernelspace. For backwards compatibility, alpha, ia64, mips64, and powerpc64 still use int-l64.h in userspace. This is the (reworked for UAPI) non-documentation part of more than two year old "asm/types.h: All architectures use int-ll64.h in kernelspace" (https://lkml.org/lkml/2011/8/13/104) Since <asm/types.h> (from include/uapi/asm-generic/types.h) is used for both kernel and user space, include/asm-generic/int-ll64.h cannot just become include/asm-generic/types.h, as Arnd suggested. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:53 -08:00
Shawn Guo	b30afea019	include/linux/genalloc.h: spinlock_t needs spinlock_types.h Compiling a C file which includes genalloc.h but without spinlock_types.h being included before, we will see the compile error below. include/linux/genalloc.h:54:2: error: unknown type name `spinlock_t' Include spinlock_types.h from genalloc.h to fix the problem. Signed-off-by: Shawn Guo <shawn.guo@linaro.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:52 -08:00
Andi Kleen	54a43d5498	numa: add a sysctl for numa_balancing Add a working sysctl to enable/disable automatic numa memory balancing at runtime. This allows us to track down performance problems with this feature and is generally a good idea. This was possible earlier through debugfs, but only with special debugging options set. Also fix the boot message. [akpm@linux-foundation.org: s/sched_numa_balancing/sysctl_numa_balancing/] Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Mel Gorman <mgorman@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:51 -08:00
Philipp Hachtmann	5e270e2548	mm: free memblock.memory in free_all_bootmem When calling free_all_bootmem() the free areas under memblock's control are released to the buddy allocator. Additionally the reserved list is freed if it was reallocated by memblock. The same should apply for the memory list. Signed-off-by: Philipp Hachtmann <phacht@linux.vnet.ibm.com> Reviewed-by: Tejun Heo <tj@kernel.org> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tang Chen <tangchen@cn.fujitsu.com> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Jianguo Wu <wujianguo@huawei.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:51 -08:00
Vladimir Davydov	f8570263ee	memcg, slab: RCU protect memcg_params for root caches We relocate root cache's memcg_params whenever we need to grow the memcg_caches array to accommodate all kmem-active memory cgroups. Currently on relocation we free the old version immediately, which can lead to use-after-free, because the memcg_caches array is accessed lock-free (see cache_from_memcg_idx()). This patch fixes this by making memcg_params RCU-protected for root caches. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:51 -08:00
Vladimir Davydov	1aa1325425	memcg, slab: clean up memcg cache initialization/destruction Currently, we have rather a messy function set relating to per-memcg kmem cache initialization/destruction. Per-memcg caches are created in memcg_create_kmem_cache(). This function calls kmem_cache_create_memcg() to allocate and initialize a kmem cache and then "registers" the new cache in the memcg_params::memcg_caches array of the parent cache. During its work-flow, kmem_cache_create_memcg() executes the following memcg-related functions: - memcg_alloc_cache_params(), to initialize memcg_params of the newly created cache; - memcg_cache_list_add(), to add the new cache to the memcg_slab_caches list. On the other hand, kmem_cache_destroy() called on a cache destruction only calls memcg_release_cache(), which does all the work: it cleans the reference to the cache in its parent's memcg_params::memcg_caches, removes the cache from the memcg_slab_caches list, and frees memcg_params. Such an inconsistency between destruction and initialization paths make the code difficult to read, so let's clean this up a bit. This patch moves all the code relating to registration of per-memcg caches (adding to memcg list, setting the pointer to a cache from its parent) to the newly created memcg_register_cache() and memcg_unregister_cache() functions making the initialization and destruction paths look symmetrical. Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:51 -08:00
Vladimir Davydov	363a044f73	memcg, slab: kmem_cache_create_memcg(): fix memleak on fail path We do not free the cache's memcg_params if __kmem_cache_create fails. Fix this. Plus, rename memcg_register_cache() to memcg_alloc_cache_params(), because it actually does not register the cache anywhere, but simply initialize kmem_cache::memcg_params. [akpm@linux-foundation.org: fix build] Signed-off-by: Vladimir Davydov <vdavydov@parallels.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Glauber Costa <glommer@gmail.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Balbir Singh <bsingharora@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Christoph Lameter <cl@linux.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:51 -08:00
Sasha Levin	309381feae	mm: dump page when hitting a VM_BUG_ON using VM_BUG_ON_PAGE Most of the VM_BUG_ON assertions are performed on a page. Usually, when one of these assertions fails we'll get a BUG_ON with a call stack and the registers. I've recently noticed based on the requests to add a small piece of code that dumps the page to various VM_BUG_ON sites that the page dump is quite useful to people debugging issues in mm. This patch adds a VM_BUG_ON_PAGE(cond, page) which beyond doing what VM_BUG_ON() does, also dumps the page before executing the actual BUG_ON. [akpm@linux-foundation.org: fix up includes] Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Cc: "Kirill A. Shutemov" <kirill@shutemov.name> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:50 -08:00
Dave Hansen	f0b791a34c	mm: print more details for bad_page() bad_page() is cool in that it prints out a bunch of data about the page. But, I can never remember which page flags are good and which are bad, or whether ->index or ->mapping is required to be NULL. This patch allows bad/dump_page() callers to specify a string about why they are dumping the page and adds explanation strings to a number of places. It also adds a 'bad_flags' argument to bad_page(), which it then dumps out separately from the flags which are actually set. This way, the messages will show specifically why the page was bad, specifically which flags it is complaining about, if it was a page flag combination which was the problem. [akpm@linux-foundation.org: switch to pr_alert] Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Christoph Lameter <cl@linux.com> Cc: Andi Kleen <andi@firstfloor.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-23 16:36:50 -08:00
Jiri Pirko	237266f76d	rtnetlink: add missing IFLA_BOND_AD_INFO_UNSPEC Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-23 13:40:51 -08:00
Linus Torvalds	5ee7a81a9f	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse Pull fuse update from Miklos Szeredi: "This contains a fix for a potential use-after-module-unload bug noticed by Al and caching improvements for read-only fuse filesystems by Andrew Gallagher" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: fuse: support clients that don't implement 'open' fuse: don't invalidate attrs when not using atime fuse: fix SetPageUptodate() condition in STORE fuse: fix pipe_buf_operations	2014-01-23 09:22:58 -08:00
Linus Torvalds	0d90d63872	f2fs updates for v3.14 This patch-set includes the following major enhancement patches. o support inline_data o refactor bio operations such as merge operations and rw type assignment o enhance the direct IO path o enhance bio operations o truncate a node page when it becomes obsolete o add sysfs entries: small_discards, max_victim_search, and in-place-update o add a sysfs entry to control max_victim_search The other bug fixes are as follows. o fix a bug in truncate_partial_nodes o avoid warnings during sparse and build process o fix error handling flows o fix potential bit overflows And, there are a bunch of cleanups. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJS4HQfAAoJEEAUqH6CSFDSyyMP/iUXSMC9yw6eOmSjAh3boc6+ C7e4zrhdovekGTuZgg41SLdr83cpbEohv11wcXAfxB+eYFEz0zrAVzt54zMi7uOL 9JmFJ6XVL/T3omI5hpEwWHg6S6tOynN6mcjacsrvypEekgjHbbpLudSw6SCu3dKz Lpc3z6CxrWbhvX8Iyf1j8mCceWkTO6eRv7u2H4Njtsq4Tukw3BHiBsURXt6kGwpx CvRBgCFdQhv4GAtbDosmVjNWOUxvik7w2epHAPQGddFTgaCL9uS+gfweHK6H9EDp 1e3BDhmn5r9IhiLY8KVXRc8+po9kQeO1jNQATBuWggfjJSGbEBmrEQX4MFE3uCi9 q84hGV9+yaJxoT2A21qIeWgorF9gjqNbnrrENKHyKhOqXJSrh48u5LUV8KqIyz1Y Qw62cypEB+PQxWegN76vwX/OrHMCLYMQ6c78bYLSwkBKonOrF5sN2+kJW5+zEj6n q2cYi1PLMJe7LTcULUrxJTSPFLKM5yA2oYZq3LN4sUYBeN6USaouaIqcZBqRBTCO adqlTa3sWytkDMAHsTpwrHABKK7pwiZoPLDVwjo0TIJ6Us4JhDtTktp5pj24fQ7Y 6lC9w4VbfAKtq8fMV17rZYD0lQFlmZk4uQRJ8XYicCRFx11kMPKYzdGmP5aVXWru wxcztktnABtCAXK0PFLf =gVDh -----END PGP SIGNATURE----- Merge tag 'for-f2fs-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs Pull f2fs updates from Jaegeuk Kim: "In this round, a couple of sysfs entries were introduced to tune the f2fs at runtime. In addition, f2fs starts to support inline_data and improves the read/write performance in some workloads by refactoring bio-related flows. This patch-set includes the following major enhancement patches. - support inline_data - refactor bio operations such as merge operations and rw type assignment - enhance the direct IO path - enhance bio operations - truncate a node page when it becomes obsolete - add sysfs entries: small_discards, max_victim_search, and in-place-update - add a sysfs entry to control max_victim_search The other bug fixes are as follows. - fix a bug in truncate_partial_nodes - avoid warnings during sparse and build process - fix error handling flows - fix potential bit overflows And, there are a bunch of cleanups" * tag 'for-f2fs-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (95 commits) f2fs: drop obsolete node page when it is truncated f2fs: introduce NODE_MAPPING for code consistency f2fs: remove the orphan block page array f2fs: add help function META_MAPPING f2fs: move a branch for code redability f2fs: call mark_inode_dirty to flush dirty pages f2fs: clean checkpatch warnings f2fs: missing REQ_META and REQ_PRIO when sync_meta_pages(META_FLUSH) f2fs: avoid f2fs_balance_fs call during pageout f2fs: add delimiter to seperate name and value in debug phrase f2fs: use spinlock rather than mutex for better speed f2fs: move alloc new orphan node out of lock protection region f2fs: move grabing orphan pages out of protection region f2fs: remove the needless parameter of f2fs_wait_on_page_writeback f2fs: update documents and a MAINTAINERS entry f2fs: add a sysfs entry to control max_victim_search f2fs: improve write performance under frequent fsync calls f2fs: avoid to read inline data except first page f2fs: avoid to left uninitialized data in page when read inline data f2fs: fix truncate_partial_nodes bug ...	2014-01-23 09:21:09 -08:00
Olof Johansson	e41006c22e	Split omap3 core padconf area into two as some of the registers in the padconf area are not accessible and used for other devices. Also fix the interrupt number for omap2 RNG, and add basic support for sbc-3xxx with cm-t3730. Note that the minor merge conflicts for omap_hwmod_2xxx_ipblock_data.c can be solved by just dropping the legacy hwmod data for interrupts for v3.14. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.15 (GNU/Linux) iQIcBAABAgAGBQJSzaM5AAoJEBvUPslcq6VzhSwQAKM4vtw0eeLTGXURMs7KdCfX B6Qo+/DW7dEkrIcFVY8n5CfKRpb48KZ4RyN7Yff5HqkPzB5+/rd5yrruzG3zPwNl q5WTxCwuJo/QGXGsEA5W0PNx22WxmC2hcburzaRb/CK8XWwAuyEQ5NU14Az+d/Lf icJAk98c4DX60Lyr743v3Tow/tJJYEisJSUxPlqoUU50oX73we7BAhGh19B40jC8 2gUXUWxaLhDT+Kry/h9R0fJD81KdPZ185GWUHnp81DHq2wj8tzclmKmic7lB9ycr lyZXff0dYSLw1Ufupi8LaLd0ULi4uFrYSD/ZgaZmTJuduSG0TXOOn5MfrPcU7Upq +vAY3YehOdMx8kyCx6rybrH/joLKdoIPT0Pqb6JCCVvnoE77fXJqRcdmfHEeUV6Z f+keyZKdiVmn1dOB7slKqTco5f+bgNh1/sQ074RJKFggLIskSbfHtjxD2S4UyVFq fK2Ch90+fWIyVP7XhSec8SKdna3mZyigMBWimf2oMmTeq/zyNLKAWj5PTzX4wch2 UAhJDTA+A/PXuriD1Hl1k6hgV+Udsndy4r0hJfwSDHgF+MP7FU1e/bceA9s0KJ5z 21DcIkSqfOdxZvM61S8uVhebBPPPrJDCk18YsZS0KyndIaI2MZdJXn8J/E7HkK1l 9JG5BwDb75nrcTrgCj/9 =sPMZ -----END PGP SIGNATURE----- Merge tag 'omap-for-v3.14/dt-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into next/boards Split omap3 core padconf area into two as some of the registers in the padconf area are not accessible and used for other devices. Also fix the interrupt number for omap2 RNG, and add basic support for sbc-3xxx with cm-t3730. Note that the minor merge conflicts for omap_hwmod_2xxx_ipblock_data.c can be solved by just dropping the legacy hwmod data for interrupts for v3.14. * tag 'omap-for-v3.14/dt-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: dts: OMAP2: fix interrupt number for rng ARM: dts: Split omap3 pinmux core device ARM: dts: Add omap specific pinctrl defines to use padconf addresses ARM: dts: Add support for sbc-3xxx with cm-t3730 Signed-off-by: Olof Johansson <olof@lixom.net>	2014-01-23 08:51:37 -08:00
Dmitry Torokhov	55df811f20	Merge branch 'next' into for-linus First round of input updates for 3.14.	2014-01-23 08:10:44 -08:00
Peter Zijlstra	215393bc1f	sched/preempt/x86: Fix voluntary preempt for x86 The #ifdef CONFIG_PREEMPT is both not needed and wrong. Its not required because asm/preempt.h should provide {set,clear}_preempt_need_resched() regardless and its wrong because for voluntary preempt we still rely on PREEMPT_NEED_RESCHED. Reported-and-Tested-by: Markus Trippelsdorf <markus@trippelsdorf.de> Fixes: `8cb75e0c4e` ("sched/preempt: Fix up missed PREEMPT_NEED_RESCHED folding") Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Dipankar Sarma <dipankar@in.ibm.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/r/20140122102435.GH31570@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>	2014-01-23 14:48:35 +01:00
Mark Brown	8aeab58e56	Merge remote-tracking branches 'spi/topic/pxa2xx', 'spi/topic/qspi', 'spi/topic/s3c24xx', 'spi/topic/s3c64xx', 'spi/topic/sh', 'spi/topic/tegra114', 'spi/topic/tegra20-sflash', 'spi/topic/tegra20-slink', 'spi/topic/txx9' and 'spi/topic/xcomm' into spi-linus	2014-01-23 13:07:14 +00:00
Mark Brown	907e26b6f5	Merge remote-tracking branches 'spi/topic/fsl-espi', 'spi/topic/gpio', 'spi/topic/hspi', 'spi/topic/mpc512x', 'spi/topic/msiof', 'spi/topic/nuc900', 'spi/topic/oc-tiny', 'spi/topic/omap', 'spi/topic/orion' and 'spi/topic/pci' into spi-linus	2014-01-23 13:07:09 +00:00
Mark Brown	3c1039745e	Merge remote-tracking branch 'spi/topic/core' into spi-linus	2014-01-23 13:07:01 +00:00
Mark Brown	7e2c225d58	Merge remote-tracking branch 'spi/fix/core' into spi-linus	2014-01-23 13:07:00 +00:00
Mark Brown	07b1980848	Merge remote-tracking branches 'regulator/topic/s2mps11', 'regulator/topic/s5m8767', 'regulator/topic/stw481x-vmmc', 'regulator/topic/tps51632', 'regulator/topic/tps62360', 'regulator/topic/tps65910', 'regulator/topic/twl' and 'regulator/topic/wm831x' into regulator-linus	2014-01-23 12:01:31 +00:00
Mark Brown	849e1517a4	Merge remote-tracking branches 'regulator/fix/pfuze100', 'regulator/fix/s5m8767', 'regulator/topic/ab8500', 'regulator/topic/act8865', 'regulator/topic/anatop', 'regulator/topic/arizona' and 'regulator/topic/as3722' into regulator-linus	2014-01-23 12:01:24 +00:00
Roland Dreier	fb1b5034e4	Merge branch 'ip-roce' into for-next Conflicts: drivers/infiniband/hw/mlx4/main.c	2014-01-22 23:24:21 -08:00
Roland Dreier	8f399921ea	Merge branches 'cma', 'cxgb4', 'flowsteer', 'ipoib', 'misc', 'mlx4', 'mlx5', 'ocrdma', 'qib', 'srp' and 'usnic' into for-next	2014-01-22 23:24:13 -08:00
Eli Cohen	8c8a49148b	IB/mlx5: Remove old field for create mkey mailbox Match firmware specification. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-01-22 23:23:54 -08:00
Eli Cohen	05bdb2ab6b	mlx5_core: Fix PowerPC support 1. Fix derivation of sub-page index from the dma address in free_4k. 2. Fix the DMA address passed to dma_unmap_page by masking it properly. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-01-22 23:23:52 -08:00
Eli Cohen	db81a5c374	mlx5_core: Improve debugfs readability Use strings to display transport service or state of QPs. Use numeric value for MTU of a QP. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-01-22 23:23:50 -08:00
Eli Cohen	bde51583f4	IB/mlx5: Add support for resize CQ Implement resize CQ which is a mandatory verb in mlx5. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-01-22 23:23:50 -08:00
Eli Cohen	3bdb31f688	IB/mlx5: Implement modify CQ Modify CQ is used by ULPs like IPoIB to change moderation parameters. This patch adds support in mlx5. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>	2014-01-22 23:23:49 -08:00
Linus Torvalds	0dc3fd0249	You'll get a merge issue on Documentation/module-signing.txt: the one in your tree is more recent, so ignore mine please. Cheers, Rusty. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.14 (GNU/Linux) iQIcBAABAgAGBQJS4IBbAAoJENkgDmzRrbjxvv4QAJRyid7F0kMMaImqNtLw+8Sc 9QouX5nE0pt9VIDVs0v3JaJVcJrFrML498DaWM8yi1M4l11EIgWH0GYJ2DLHt3yf F9AGNmwZRzSJnPVyG7HWOlMbqycwh95P3ldJ3NmGClI++BKI/Qn9ffa/hdy7O+1E y7K0TWQQmG1HgIA7BVY491bzQTiUtelqrKShup0TQ6hRYYESs1g+JOClQgeGiFLG Q3JInsYjc7mOtijdfivln4sZKbZjW8oP1Sud/3FrXfLEyS6WiZPDSE82H17ZpZrc q+tlGXziXLJKo8G3jcjCDGvM0GsMDZZe5LJQ7Ax28K73I0o9N7JTbC1cePBg3AWz YW7tWS3DQ9aNfQOum6aCFv9Nu9RadsIVOqMCDdlULbe7fCtK0PeIBxdgLZQOBauM Z5+fCwtfiW6VASnz+UFo+n6V5yzmOEsJToem1a8+UySnlKuO+NkdACBOhCxfditj 5nfOmj1rwoTKacc0jqTZ1twVBfVDSa3PYEqFf9e9DYx17ZjKEFZ0FV9NRX3EZnaN n5Z/Pw+jWMDL1FXFBMNfHUpfmq1spAul+h2+OMMc7+91v1TmRI4Xbeu1jWJeYv+S 0woqfOEHhB+0a6vgxhsXg2SWkyTGHGq/UdX0J4tvN2txKwde8zi2f2TIDdyDhSw9 YWghWp9FuiEuKJ8UgRX5 =9fZu -----END PGP SIGNATURE----- Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module updates from Rusty Russell. * tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: module: Add missing newline in printk call. module: fix coding style export: declare ksymtab symbols module.h: Remove unnecessary semicolon params: improve standard definitions Add Documentation/module-signing.txt file	2014-01-22 22:30:15 -08:00
Linus Torvalds	84621c9b18	Features: - FIFO event channels. Key advantages: support for over 100,000 events (2^17), 16 different event priorities, improved fairness in event latency through the use of FIFOs. - Xen PVH support. "It’s a fully PV kernel mode, running with paravirtualized disk and network, paravirtualized interrupts and timers, no emulated devices of any kind (and thus no qemu), no BIOS or legacy boot — but instead of requiring PV MMU, it uses the HVM hardware extensions to virtualize the pagetables, as well as system calls and other privileged operations." (from "The Paravirtualization Spectrum, Part 2: From poles to a spectrum") Bug-fixes: - Fixes in balloon driver (refactor and make it work under ARM) - Allow xenfb to be used in HVM guests. - Allow xen_platform_pci=0 to work properly. - Refactors in event channels. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJS4BmLAAoJEFjIrFwIi8fJ4SAH/iNGESowgMhfW64vRA8pBWq+ NRJpUjYjjwmbxpwoNl6NPwn15cIXFyc3sMtvvrDD3taRDyko2RFuT+NTjpO05xPh d/cRpRXpXERHoiFgPf/WTp7ONBDhvPtHG0+BzJKwgqEIOUYXdbhD+gEjaVlFJScS CAY68OLmk7XYMSZBNzPfKNbSCyhVgZF7wpaimK9lxZBKsFRCDIq6jIyrAsC8epIL 6V/V4l2S6lk/uUeGB6ULphYeINjI2kkpbSfCd1vyenLfWpVscc2o8uWEYFcZMAxy V4HpsoseuqrfdDqgPfud3VgogdISvbkCvDfW85rzfDP4MWxei2mVHFtJ/gSBV+g= =ToNG -----END PGP SIGNATURE----- Merge tag 'stable/for-linus-3.14-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull Xen updates from Konrad Rzeszutek Wilk: "Two major features that Xen community is excited about: The first is event channel scalability by David Vrabel - we switch over from an two-level per-cpu bitmap of events (IRQs) - to an FIFO queue with priorities. This lets us be able to handle more events, have lower latency, and better scalability. Good stuff. The other is PVH by Mukesh Rathor. In short, PV is a mode where the kernel lets the hypervisor program page-tables, segments, etc. With EPT/NPT capabilities in current processors, the overhead of doing this in an HVM (Hardware Virtual Machine) container is much lower than the hypervisor doing it for us. In short we let a PV guest run without doing page-table, segment, syscall, etc updates through the hypervisor - instead it is all done within the guest container. It is a "hybrid" PV - hence the 'PVH' name - a PV guest within an HVM container. The major benefits are less code to deal with - for example we only use one function from the the pv_mmu_ops (which has 39 function calls); faster performance for syscall (no context switches into the hypervisor); less traps on various operations; etc. It is still being baked - the ABI is not yet set in stone. But it is pretty awesome and we are excited about it. Lastly, there are some changes to ARM code - you should get a simple conflict which has been resolved in #linux-next. In short, this pull has awesome features. Features: - FIFO event channels. Key advantages: support for over 100,000 events (2^17), 16 different event priorities, improved fairness in event latency through the use of FIFOs. - Xen PVH support. "It’s a fully PV kernel mode, running with paravirtualized disk and network, paravirtualized interrupts and timers, no emulated devices of any kind (and thus no qemu), no BIOS or legacy boot — but instead of requiring PV MMU, it uses the HVM hardware extensions to virtualize the pagetables, as well as system calls and other privileged operations." (from "The Paravirtualization Spectrum, Part 2: From poles to a spectrum") Bug-fixes: - Fixes in balloon driver (refactor and make it work under ARM) - Allow xenfb to be used in HVM guests. - Allow xen_platform_pci=0 to work properly. - Refactors in event channels" * tag 'stable/for-linus-3.14-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (52 commits) xen/pvh: Set X86_CR0_WP and others in CR0 (v2) MAINTAINERS: add git repository for Xen xen/pvh: Use 'depend' instead of 'select'. xen: delete new instances of __cpuinit usage xen/fb: allow xenfb initialization for hvm guests xen/evtchn_fifo: fix error return code in evtchn_fifo_setup() xen-platform: fix error return code in platform_pci_init() xen/pvh: remove duplicated include from enlighten.c xen/pvh: Fix compile issues with xen_pvh_domain() xen: Use dev_is_pci() to check whether it is pci device xen/grant-table: Force to use v1 of grants. xen/pvh: Support ParaVirtualized Hardware extensions (v3). xen/pvh: Piggyback on PVHVM XenBus. xen/pvh: Piggyback on PVHVM for grant driver (v4) xen/grant: Implement an grant frame array struct (v3). xen/grant-table: Refactor gnttab_init xen/grants: Remove gnttab_max_grant_frames dependency on gnttab_init. xen/pvh: Piggyback on PVHVM for event channels (v2) xen/pvh: Update E820 to work with PVH (v2) xen/pvh: Secondary VCPU bringup (non-bootup CPUs) ...	2014-01-22 22:00:18 -08:00
Jiri Pirko	a9517d0f43	rtnetlink: remove ndo_get_slave No longer used API bond-specific can be removed now. This is now handled in a generic way in rtnl_link_ops. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-22 21:57:17 -08:00
Jiri Pirko	ba7d49b1f0	rtnetlink: provide api for getting and setting slave info Recent patch bonding: add netlink attributes to slave link dev (`1d3ee88ae0`) Introduced yet another device specific way to access slave information over rtnetlink. There is one already there for bridge. This patch introduces generic way to do this, for getting and setting info as well by extending link_ops. Later on, this new interface will be used for bridge ports as well. Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-22 21:57:05 -08:00
Jiri Pirko	df7dbcbbaf	rtnetlink: put "BOND" into nl attribute names which are related to bonding Signed-off-by: Jiri Pirko <jiri@resnulli.us> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-22 21:57:05 -08:00
FX Le Bail	7c90cc2d40	ipv6: enable anycast addresses as source addresses for datagrams This change allows to consider an anycast address valid as source address when given via an IPV6_PKTINFO or IPV6_2292PKTINFO ancillary data item. So, when sending a datagram with ancillary data, the unicast and anycast addresses are handled in the same way. - Adds ipv6_chk_acast_addr_src() to check if an anycast address is link-local on given interface or is global. - Uses it in ip6_datagram_send_ctl(). Signed-off-by: Francois-Xavier Le Bail <fx.lebail@yahoo.com> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-22 21:57:05 -08:00
FX Le Bail	eb97768acb	net: update comments of "struct msghdr" with the more accurate RFC3542 ones Signed-off-by: Francois-Xavier Le Bail <fx.lebail@yahoo.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-22 21:57:05 -08:00
Linus Torvalds	7ebd3faa9b	First round of KVM updates for 3.14; PPC parts will come next week. Nothing major here, just bugfixes all over the place. The most interesting part is the ARM guys' virtualized interrupt controller overhaul, which lets userspace get/set the state and thus enables migration of ARM VMs. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJS3TVKAAoJEBvWZb6bTYbyIFgP/2cmt4ifCuFMaZv4+G1S8jZU uC9ZB/+7vzht/p6zAy+4BxurKbHmSBFkC1OKcxYuy7yB4CQkHabzj4V2vRtqFdwH 5lExP9qh3kqaVLuhnvxLTmkktR3EW4PFy6OI53l5kRNktOXSuZ0aN6K3V7tCg/X0 iL7ASo4bJKlxeWcDpmuVrNgAajmZVfXrjKY7robgBQno+yIsgKhRZRBQHjozA6B8 FpCo/k48RZd/EzIbV/PDDRI4hmmry/lgrO9SKjzq56wSqff2bd/k/KYze4dbAPfd Ps60enPTuHmeEjjb4MMMU4EKHVdTQFUMx/xZCmT4xzoh8s4of6RHphXbfE0SUznQ dTveyEQAR7E3JNS0k1+3WEX5fWlFesp0hO2NeE0wzUq4TAr9ztgVO9NQ6Si15e7Z 2HysO0T5Ojtt0lY08/PvS6i48eCAuuBomrejJS8hLW4SUZ5adn+yW4Qo7Fp9JeBR l9a3LsVT8BZMtUWrUuFcVhlM4MbzElUPjDbgWhR8UYU/kpfVZOQu8qWgGKR4UWXy X7/t9l/tjR99CmfMJBAOzJid+ScSpAfg77BdaKiQrVfVIJmsjEjlO8vUMyj5b1HF hPX5wNyJjHAOfridLeHSs4Rdm4a8sk8Az5d4h76pLVz8M4jyTi2v0rO3N4/dU/pu x7N8KR5hAj+mLBoM9/Al =8sYU -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM updates from Paolo Bonzini: "First round of KVM updates for 3.14; PPC parts will come next week. Nothing major here, just bugfixes all over the place. The most interesting part is the ARM guys' virtualized interrupt controller overhaul, which lets userspace get/set the state and thus enables migration of ARM VMs" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (67 commits) kvm: make KVM_MMU_AUDIT help text more readable KVM: s390: Fix memory access error detection KVM: nVMX: Update guest activity state field on L2 exits KVM: nVMX: Fix nested_run_pending on activity state HLT KVM: nVMX: Clean up handling of VMX-related MSRs KVM: nVMX: Add tracepoints for nested_vmexit and nested_vmexit_inject KVM: nVMX: Pass vmexit parameters to nested_vmx_vmexit KVM: nVMX: Leave VMX mode on clearing of feature control MSR KVM: VMX: Fix DR6 update on #DB exception KVM: SVM: Fix reading of DR6 KVM: x86: Sync DR7 on KVM_SET_DEBUGREGS add support for Hyper-V reference time counter KVM: remove useless write to vcpu->hv_clock.tsc_timestamp KVM: x86: fix tsc catchup issue with tsc scaling KVM: x86: limit PIT timer frequency KVM: x86: handle invalid root_hpa everywhere kvm: Provide kvm_vcpu_eligible_for_directed_yield() stub kvm: vfio: silence GCC warning KVM: ARM: Remove duplicate include arm/arm64: KVM: relax the requirements of VMA alignment for THP ...	2014-01-22 21:40:43 -08:00
Linus Torvalds	bb1281f2aa	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree updates from Jiri Kosina: "Usual rocket science stuff from trivial.git" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) neighbour.h: fix comment sched: Fix warning on make htmldocs caused by wait.h slab: struct kmem_cache is protected by slab_mutex doc: Fix typo in USB Gadget Documentation of/Kconfig: Spelling s/one/once/ mkregtable: Fix sscanf handling lp5523, lp8501: comment improvements thermal: rcar: comment spelling treewide: fix comments and printk msgs IXP4xx: remove '1 &&' from a condition check in ixp4xx_restart() Documentation: update /proc/uptime field description Documentation: Fix size parameter for snprintf arm: fix comment header and macro name asm-generic: uaccess: Spelling s/a ny/any/ mtd: onenand: fix comment header doc: driver-model/platform.txt: fix a typo drivers: fix typo in DEVTMPFS_MOUNT Kconfig help text doc: Fix typo (acces_process_vm -> access_process_vm) treewide: Fix typos in printk drivers/gpu/drm/qxl/Kconfig: reformat the help text ...	2014-01-22 21:21:55 -08:00
Linus Torvalds	fe41c2c018	A set of device-mapper changes for 3.14. A lot of attention was paid to improving the thin-provisioning target's handling of metadata operation failures and running out of space. A new 'error_if_no_space' feature was added to allow users to error IOs rather than queue them when either the data or metadata space is exhausted. Additional fixes/features include: - a few fixes to properly support thin metadata device resizing - a solution for reliably waiting for a DM device's embedded kobject to be released before destroying the device - old dm-snapshot is updated to use the dm-bufio interface to take advantage of readahead capabilities that improve snapshot activation - new dm-cache target tunables to control how quickly data is promoted to the cache (fast) device - improved write efficiency of cluster mirror target by combining userspace flush and mark requests -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJS4GClAAoJEMUj8QotnQNacdEH/2ES5k5itUQRY9jeI+u2zYNP vdsRTYf+97+B3jpRvpWbMt4kxT2tjaQbkxJ+iKRHy2MBLFUgq8ruH1RS/Q5VbDeg 6i6ol8mpNxhlvo/KTMxXqRcWDSxShiMfhz2lXC2bJ7M4sP/iiH85s4Pm4YQ59jpd OIX7qj36m/cV/le9YQbexJEEsaj+3genbzL26wyyvtG/rT9fWnXa7clj2gqTdToG YCEBCRf5FH9X6W/Oc50nMw5n2dt/MRmPre/MAlOjemeaosB0WJiKaswM25rnvHp0 JnhxQ2K2C5KIKAWIfwPOImdb9zWW7p1dIRLsS8nHBUQr0BF5VRkmvpnYH4qBtcc= =e7e0 -----END PGP SIGNATURE----- Merge tag 'dm-3.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device-mapper changes from Mike Snitzer: "A lot of attention was paid to improving the thin-provisioning target's handling of metadata operation failures and running out of space. A new 'error_if_no_space' feature was added to allow users to error IOs rather than queue them when either the data or metadata space is exhausted. Additional fixes/features include: - a few fixes to properly support thin metadata device resizing - a solution for reliably waiting for a DM device's embedded kobject to be released before destroying the device - old dm-snapshot is updated to use the dm-bufio interface to take advantage of readahead capabilities that improve snapshot activation - new dm-cache target tunables to control how quickly data is promoted to the cache (fast) device - improved write efficiency of cluster mirror target by combining userspace flush and mark requests" * tag 'dm-3.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (35 commits) dm log userspace: allow mark requests to piggyback on flush requests dm space map metadata: fix bug in resizing of thin metadata dm cache: add policy name to status output dm thin: fix pool feature parsing dm sysfs: fix a module unload race dm snapshot: use dm-bufio prefetch dm snapshot: use dm-bufio dm snapshot: prepare for switch to using dm-bufio dm snapshot: use GFP_KERNEL when initializing exceptions dm cache: add block sizes and total cache blocks to status output dm btree: add dm_btree_find_lowest_key dm space map metadata: fix extending the space map dm space map common: make sure new space is used during extend dm: wait until embedded kobject is released before destroying a device dm: remove pointless kobject comparison in dm_get_from_kobject dm snapshot: call destroy_work_on_stack() to pair with INIT_WORK_ONSTACK() dm cache policy mq: introduce three promotion threshold tunables dm cache policy mq: use list_del_init instead of list_del + INIT_LIST_HEAD dm thin: fix set_pool_mode exposed pool operation races dm thin: eliminate the no_free_space flag ...	2014-01-22 20:17:48 -08:00
Neil Horman	2d36097d26	af_packet: Add Queue mapping mode to af_packet fanout operation This patch adds a queue mapping mode to the fanout operation of af_packet sockets. This allows user space af_packet users to better filter on flows ingressing and egressing via a specific hardware queue, and avoids the potential packet reordering that can occur when FANOUT_CPU is being used and irq affinity varies. Tested successfully by myself. applies to net-next Signed-off-by: Neil Horman <nhorman@tuxdriver.com> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-22 17:35:50 -08:00
Linus Torvalds	194e57fd18	SCSI for-linus on 20140122 This patch set is a lot of driver updates for qla4xxx, bfa, hpsa, qla2xxx. It also removes the aic7xxx_old driver (which has been deprecated for nearly a decade) and adds support for deadlines in error handling. Signed-off-by: James Bottomley <JBottomley@Parallels.com> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iQEcBAABAgAGBQJS4AfxAAoJEDeqqVYsXL0M8DAIAKRtc9+hy2zsratyvHxxZlz4 6tdvT/4dLTXbIS7EFHmdVb09G5791Pj2jx2PAza/zFDSzgMyL8reXjXpDtCQBvsG U9CKAGAfJi/vD3QSez2qh0SyvIfCHjatlU3IsM/g+5iFcIOO9+LMJuRlw3qtVqOZ iQQBATH8cuVW7k1YRHXS5xAO8cVCEYBPuuz0UreVnZijTV5lsRFaObaq8r7S7yVM 4F5eSb5lcgZZLDMDMB0E2KHq12QoLiLQy/ahCqAL1od41v+P90BHC4cIu8jkdB0l nUYp8vHkmsM82L3xxK9HglcZvJ94fkz6n1gwULl2dTNSYldUnXc/xmocbt0KMf8= =yko3 -----END PGP SIGNATURE----- Merge tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI updates from James Bottomley: "This patch set is a lot of driver updates for qla4xxx, bfa, hpsa, qla2xxx. It also removes the aic7xxx_old driver (which has been deprecated for nearly a decade) and adds support for deadlines in error handling" * tag 'scsi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (75 commits) [SCSI] hpsa: allow SCSI mid layer to handle unit attention [SCSI] hpsa: do not require board "not ready" status after hard reset [SCSI] hpsa: enable unit attention reporting [SCSI] hpsa: rename scsi prefetch field [SCSI] hpsa: use workqueue instead of kernel thread for lockup detection [SCSI] ipr: increase dump size in ipr driver [SCSI] mac_scsi: Fix crash on out of memory [SCSI] st: fix enlarge_buffer [SCSI] qla1280: Annotate timer on stack so object debug does not complain [SCSI] qla4xxx: Update driver version to 5.04.00-k3 [SCSI] qla4xxx: Recreate chap data list during get chap operation [SCSI] qla4xxx: Add support for ISCSI_PARAM_LOCAL_IPADDR sysfs attr [SCSI] libiscsi: Add local_ipaddr parameter in iscsi_conn struct [SCSI] scsi_transport_iscsi: Export ISCSI_PARAM_LOCAL_IPADDR attr for iscsi_connection [SCSI] qla4xxx: Add host statistics support [SCSI] scsi_transport_iscsi: Add host statistics support [SCSI] qla4xxx: Added support for Diagnostics MBOX command [SCSI] bfa: Driver version upgrade to 3.2.23.0 [SCSI] bfa: change FC_ELS_TOV to 20sec [SCSI] bfa: Observed auto D-port mode instead of manual ...	2014-01-22 17:32:26 -08:00
Linus Torvalds	e1ba84597c	PCI changes for the v3.14 merge window: Resource management - Change pci_bus_region addresses to dma_addr_t (Bjorn Helgaas) - Support 64-bit AGP BARs (Bjorn Helgaas, Yinghai Lu) - Add pci_bus_address() to get bus address of a BAR (Bjorn Helgaas) - Use pci_resource_start() for CPU address of AGP BARs (Bjorn Helgaas) - Enforce bus address limits in resource allocation (Yinghai Lu) - Allocate 64-bit BARs above 4G when possible (Yinghai Lu) - Convert pcibios_resource_to_bus() to take pci_bus, not pci_dev (Yinghai Lu) PCI device hotplug - Major rescan/remove locking update (Rafael J. Wysocki) - Make ioapic builtin only (not modular) (Yinghai Lu) - Fix release/free issues (Yinghai Lu) - Clean up pciehp (Bjorn Helgaas) - Announce pciehp slot info during enumeration (Bjorn Helgaas) MSI - Add pci_msi_vec_count(), pci_msix_vec_count() (Alexander Gordeev) - Add pci_enable_msi_range(), pci_enable_msix_range() (Alexander Gordeev) - Deprecate "tri-state" interfaces: fail/success/fail+info (Alexander Gordeev) - Export MSI mode using attributes, not kobjects (Greg Kroah-Hartman) - Drop "irq" param from _restore_msi_irqs() (DuanZhenzhong) SR-IOV - Clear NumVFs when disabling SR-IOV in sriov_init() (ethan.zhao) Virtualization - Add support for save/restore of extended capabilities (Alex Williamson) - Add Virtual Channel to save/restore support (Alex Williamson) - Never treat a VF as a multifunction device (Alex Williamson) - Add pci_try_reset_function(), et al (Alex Williamson) AER - Ignore non-PCIe error sources (Betty Dall) - Support ACPI HEST error sources for domains other than 0 (Betty Dall) - Consolidate HEST error source parsers (Bjorn Helgaas) - Add a TLP header print helper (Borislav Petkov) Freescale i.MX6 - Remove unnecessary code (Fabio Estevam) - Make reset-gpio optional (Marek Vasut) - Report "link up" only after link training completes (Marek Vasut) - Start link in Gen1 before negotiating for Gen2 mode (Marek Vasut) - Fix PCIe startup code (Richard Zhu) Marvell MVEBU - Remove duplicate of_clk_get_by_name() call (Andrew Lunn) - Drop writes to bridge Secondary Status register (Jason Gunthorpe) - Obey bridge PCI_COMMAND_MEM and PCI_COMMAND_IO bits (Jason Gunthorpe) - Support a bridge with no IO port window (Jason Gunthorpe) - Use max_t() instead of max(resource_size_t,) (Jingoo Han) - Remove redundant of_match_ptr (Sachin Kamat) - Call pci_ioremap_io() at startup instead of dynamically (Thomas Petazzoni) NVIDIA Tegra - Disable Gen2 for Tegra20 and Tegra30 (Eric Brower) Renesas R-Car - Add runtime PM support (Valentine Barshak) - Fix rcar_pci_probe() return value check (Wei Yongjun) Synopsys DesignWare - Fix crash in dw_msi_teardown_irq() (Bjørn Erik Nilsen) - Remove redundant call to pci_write_config_word() (Bjørn Erik Nilsen) - Fix missing MSI IRQs (Harro Haan) - Add dw_pcie prefix before cfg_read/write (Pratyush Anand) - Fix I/O transfers by using CPU (not realio) address (Pratyush Anand) - Whitespace cleanup (Jingoo Han) EISA - Call put_device() if device_register() fails (Levente Kurusa) - Revert EISA initialization breakage ((Bjorn Helgaas) Miscellaneous - Remove unused code, including PCIe 3.0 interfaces (Stephen Hemminger) - Prevent bus conflicts while checking for bridge apertures (Bjorn Helgaas) - Stop clearing bridge Secondary Status when setting up I/O aperture (Bjorn Helgaas) - Use dev_is_pci() to identify PCI devices (Yijing Wang) - Deprecate DEFINE_PCI_DEVICE_TABLE (Joe Perches) - Update documentation 00-INDEX (Erik Ekman) -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJS3ujEAAoJEFmIoMA60/r8A4EQAK9AZSUSVNWvlKdC1PrBfT3w 7fVILx5A4KWsOU8eoFwCPQLrgvUtMltg16yN2tbCjqpKEdrVc36biMO9bwhnXSyZ KopHKMWnn0sza/z2H8mcGy+0azGdWcIjcErX/a8WeS6zyWBjm+yzckrHNVpPu4Ca SpCBhfgBMjKyIZyLtP6juFSH34S2DfQex4oUSyPC+gjqPy5wW/xw/kBxZfOXl+yU P9pQT+geMIc31pETMdG9wd/TT+47YAui4ieSggoVxfVrphCXv6S8mOMCMuQc2bAy MHy9uFm1jbvKZZIYrzJ+9HFiiU/6MNiOO3Ygua52xuSp1Zrcjwi2CLD9/QBXbDVs pTKU5JIO7q43llkQUpIXTwBvEApSZRhuqzXegsMAYIg4AWmbfm/2fXkfWlQThYMp J48blAJZ4t0vhMr9usgwbtdBe8F5euExOxpwH0QMCMABbuu8/B3TLm39+LTcIbsw Efgm3N9iUTyiV5fe9Rr62nflhyqXjTevPl4wbZZe4OOdm0MXZY+/BzuNJhg3wyY8 QKz2J3FB6OR7BCLHCp80l50s5+Ih4F5kmOXwFKjT1D1MFRaNaPDmp9BY6TitU6hg zj55gP4c8x6n3alakbf972Yhgs/4oi1va8cZL+pCYWb8nPO5ldaMiT7QBBLUreQV BtDtC7u/AFWJ5e73+jVO =La1R -----END PGP SIGNATURE----- Merge tag 'pci-v3.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "PCI changes for the v3.14 merge window: Resource management - Change pci_bus_region addresses to dma_addr_t (Bjorn Helgaas) - Support 64-bit AGP BARs (Bjorn Helgaas, Yinghai Lu) - Add pci_bus_address() to get bus address of a BAR (Bjorn Helgaas) - Use pci_resource_start() for CPU address of AGP BARs (Bjorn Helgaas) - Enforce bus address limits in resource allocation (Yinghai Lu) - Allocate 64-bit BARs above 4G when possible (Yinghai Lu) - Convert pcibios_resource_to_bus() to take pci_bus, not pci_dev (Yinghai Lu) PCI device hotplug - Major rescan/remove locking update (Rafael J. Wysocki) - Make ioapic builtin only (not modular) (Yinghai Lu) - Fix release/free issues (Yinghai Lu) - Clean up pciehp (Bjorn Helgaas) - Announce pciehp slot info during enumeration (Bjorn Helgaas) MSI - Add pci_msi_vec_count(), pci_msix_vec_count() (Alexander Gordeev) - Add pci_enable_msi_range(), pci_enable_msix_range() (Alexander Gordeev) - Deprecate "tri-state" interfaces: fail/success/fail+info (Alexander Gordeev) - Export MSI mode using attributes, not kobjects (Greg Kroah-Hartman) - Drop "irq" param from _restore_msi_irqs() (DuanZhenzhong) SR-IOV - Clear NumVFs when disabling SR-IOV in sriov_init() (ethan.zhao) Virtualization - Add support for save/restore of extended capabilities (Alex Williamson) - Add Virtual Channel to save/restore support (Alex Williamson) - Never treat a VF as a multifunction device (Alex Williamson) - Add pci_try_reset_function(), et al (Alex Williamson) AER - Ignore non-PCIe error sources (Betty Dall) - Support ACPI HEST error sources for domains other than 0 (Betty Dall) - Consolidate HEST error source parsers (Bjorn Helgaas) - Add a TLP header print helper (Borislav Petkov) Freescale i.MX6 - Remove unnecessary code (Fabio Estevam) - Make reset-gpio optional (Marek Vasut) - Report "link up" only after link training completes (Marek Vasut) - Start link in Gen1 before negotiating for Gen2 mode (Marek Vasut) - Fix PCIe startup code (Richard Zhu) Marvell MVEBU - Remove duplicate of_clk_get_by_name() call (Andrew Lunn) - Drop writes to bridge Secondary Status register (Jason Gunthorpe) - Obey bridge PCI_COMMAND_MEM and PCI_COMMAND_IO bits (Jason Gunthorpe) - Support a bridge with no IO port window (Jason Gunthorpe) - Use max_t() instead of max(resource_size_t,) (Jingoo Han) - Remove redundant of_match_ptr (Sachin Kamat) - Call pci_ioremap_io() at startup instead of dynamically (Thomas Petazzoni) NVIDIA Tegra - Disable Gen2 for Tegra20 and Tegra30 (Eric Brower) Renesas R-Car - Add runtime PM support (Valentine Barshak) - Fix rcar_pci_probe() return value check (Wei Yongjun) Synopsys DesignWare - Fix crash in dw_msi_teardown_irq() (Bjørn Erik Nilsen) - Remove redundant call to pci_write_config_word() (Bjørn Erik Nilsen) - Fix missing MSI IRQs (Harro Haan) - Add dw_pcie prefix before cfg_read/write (Pratyush Anand) - Fix I/O transfers by using CPU (not realio) address (Pratyush Anand) - Whitespace cleanup (Jingoo Han) EISA - Call put_device() if device_register() fails (Levente Kurusa) - Revert EISA initialization breakage ((Bjorn Helgaas) Miscellaneous - Remove unused code, including PCIe 3.0 interfaces (Stephen Hemminger) - Prevent bus conflicts while checking for bridge apertures (Bjorn Helgaas) - Stop clearing bridge Secondary Status when setting up I/O aperture (Bjorn Helgaas) - Use dev_is_pci() to identify PCI devices (Yijing Wang) - Deprecate DEFINE_PCI_DEVICE_TABLE (Joe Perches) - Update documentation 00-INDEX (Erik Ekman)" * tag 'pci-v3.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (119 commits) Revert "EISA: Initialize device before its resources" Revert "EISA: Log device resources in dmesg" vfio-pci: Use pci "try" reset interface PCI: Check parent kobject in pci_destroy_dev() xen/pcifront: Use global PCI rescan-remove locking powerpc/eeh: Use global PCI rescan-remove locking PCI: Fix pci_check_and_unmask_intx() comment typos PCI: Add pci_try_reset_function(), pci_try_reset_slot(), pci_try_reset_bus() MPT / PCI: Use pci_stop_and_remove_bus_device_locked() platform / x86: Use global PCI rescan-remove locking PCI: hotplug: Use global PCI rescan-remove locking pcmcia: Use global PCI rescan-remove locking ACPI / hotplug / PCI: Use global PCI rescan-remove locking ACPI / PCI: Use global PCI rescan-remove locking in PCI root hotplug PCI: Add global pci_lock_rescan_remove() PCI: Cleanup pci.h whitespace PCI: Reorder so actual code comes before stubs PCI/AER: Support ACPI HEST AER error sources for PCI domains other than 0 ACPICA: Add helper macros to extract bus/segment numbers from HEST table. PCI: Make local functions static ...	2014-01-22 16:39:28 -08:00
Linus Torvalds	60eaa0190f	This pull request has a new feature to ftrace, namely the trace event triggers by Tom Zanussi. A trigger is a way to enable an action when an event is hit. The actions are: o trace on/off - enable or disable tracing o snapshot - save the current trace buffer in the snapshot o stacktrace - dump the current stack trace to the ringbuffer o enable/disable events - enable or disable another event Namhyung Kim added updates to the tracing uprobes code. Having the uprobes add support for fetch methods. The rest are various bug fixes with the new code, and minor ones for the old code. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.15 (GNU/Linux) iQEcBAABAgAGBQJS3Z9fAAoJEKQekfcNnQGuFf0H/0CteaN+BJjpif6Tnxia15Sp pcftzU0lgqfNzsfitmbjiVTgXWqCghoZo8UI9tQZvBZ9wmDIxeXQR73uoBgVlSCQ ovyBO/R8r+lq+7EsDCwntZvrLbcdn6s/jzoruRvt7r35ghK5pH81DNR1BOzTQBhW x+361Xtc13aok7N7JN8KR96VDUP9f8KU6PWqJ5lgS2Zl+wbVw6b0p8OV8IMCHczP MdYrx8y4Jv4QWW7rMShAAVBe9qJQ56JWiWA17ysa4kY8BkKQ7QtlEFr+r1YY0nX5 67brXiL8u0NFzRx5y2VRpGc25BbImnVBFpoLQ5Itluq9OdZE3aOQubzXlY70R6g= =Hkho -----END PGP SIGNATURE----- Merge tag 'trace-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing updates from Steven Rostedt: "This pull request has a new feature to ftrace, namely the trace event triggers by Tom Zanussi. A trigger is a way to enable an action when an event is hit. The actions are: o trace on/off - enable or disable tracing o snapshot - save the current trace buffer in the snapshot o stacktrace - dump the current stack trace to the ringbuffer o enable/disable events - enable or disable another event Namhyung Kim added updates to the tracing uprobes code. Having the uprobes add support for fetch methods. The rest are various bug fixes with the new code, and minor ones for the old code" * tag 'trace-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (38 commits) tracing: Fix buggered tee(2) on tracing_pipe tracing: Have trace buffer point back to trace_array ftrace: Fix synchronization location disabling and freeing ftrace_ops ftrace: Have function graph only trace based on global_ops filters ftrace: Synchronize setting function_trace_op with ftrace_trace_function tracing: Show available event triggers when no trigger is set tracing: Consolidate event trigger code tracing: Fix counter for traceon/off event triggers tracing: Remove double-underscore naming in syscall trigger invocations tracing/kprobes: Add trace event trigger invocations tracing/probes: Fix build break on !CONFIG_KPROBE_EVENT tracing/uprobes: Add @+file_offset fetch method uprobes: Allocate ->utask before handler_chain() for tracing handlers tracing/uprobes: Add support for full argument access methods tracing/uprobes: Fetch args before reserving a ring buffer tracing/uprobes: Pass 'is_return' to traceprobe_parse_probe_arg() tracing/probes: Implement 'memory' fetch method for uprobes tracing/probes: Add fetch{,_size} member into deref fetch method tracing/probes: Move 'symbol' fetch method to kprobes tracing/probes: Implement 'stack' fetch method for uprobes ...	2014-01-22 16:35:21 -08:00
Miklos Szeredi	28a625cbc2	fuse: fix pipe_buf_operations Having this struct in module memory could Oops when if the module is unloaded while the buffer still persists in a pipe. Since sock_pipe_buf_ops is essentially the same as fuse_dev_pipe_buf_steal merge them into nosteal_pipe_buf_ops (this is the same as default_pipe_buf_ops except stealing the page from the buffer is not allowed). Reported-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz> Cc: stable@vger.kernel.org	2014-01-22 19:36:57 +01:00
Li Zhong	c04e7da013	neighbour.h: fix comment Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2014-01-22 10:34:15 +01:00
Masanari Iida	f434f7afa5	sched: Fix warning on make htmldocs caused by wait.h Missing "@" in include/linux/wait.h cause "make htmldocs" failed with following warning messages. Warning(/home/iida/Repo/linux-next//include/linux/wait.h:304): No description found for parameter 'cmd1' Warning(/home/iida/Repo/linux-next//include/linux/wait.h:304): No description found for parameter 'cmd2' Signed-off-by: Masanari Iida <standby24x7@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2014-01-22 10:25:39 +01:00
Hannes Frederic Sowa	809fa972fd	reciprocal_divide: update/correction of the algorithm Jakub Zawadzki noticed that some divisions by reciprocal_divide() were not correct [1][2], which he could also show with BPF code after divisions are transformed into reciprocal_value() for runtime invariance which can be passed to reciprocal_divide() later on; reverse in BPF dump ended up with a different, off-by-one K in some situations. This has been fixed by Eric Dumazet in commit `aee636c480` ("bpf: do not use reciprocal divide"). This follow-up patch improves reciprocal_value() and reciprocal_divide() to work in all cases by using Granlund and Montgomery method, so that also future use is safe and without any non-obvious side-effects. Known problems with the old implementation were that division by 1 always returned 0 and some off-by-ones when the dividend and divisor where very large. This seemed to not be problematic with its current users, as far as we can tell. Eric Dumazet checked for the slab usage, we cannot surely say so in the case of flex_array. Still, in order to fix that, we propose an extension from the original implementation from commit `6a2d7a955d` resp. [3][4], by using the algorithm proposed in "Division by Invariant Integers Using Multiplication" [5], Torbjörn Granlund and Peter L. Montgomery, that is, pseudocode for q = n/d where q, n, d is in u32 universe: 1) Initialization: int l = ceil(log_2 d) uword m' = floor((1<<32)((1<<l)-d)/d)+1 int sh_1 = min(l,1) int sh_2 = max(l-1,0) 2) For q = n/d, all uword: uword t = (nm')>>32 q = (t+((n-t)>>sh_1))>>sh_2 The assembler implementation from Agner Fog [6] also helped a lot while implementing. We have tested the implementation on x86_64, ppc64, i686, s390x; on x86_64/haswell we're still half the latency compared to normal divide. Joint work with Daniel Borkmann. [1] http://www.wireshark.org/~darkjames/reciprocal-buggy.c [2] http://www.wireshark.org/~darkjames/set-and-dump-filter-k-bug.c [3] https://gmplib.org/~tege/division-paper.pdf [4] http://homepage.cs.uiowa.edu/~jones/bcd/divide.html [5] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.1.2556 [6] http://www.agner.org/optimize/asmlib.zip Reported-by: Jakub Zawadzki <darkjames-ws@darkjames.pl> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Austin S Hemmelgarn <ahferroin7@gmail.com> Cc: linux-kernel@vger.kernel.org Cc: Jesse Gross <jesse@nicira.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Matt Mackall <mpm@selenic.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Andy Gospodarek <andy@greyhouse.net> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Jakub Zawadzki <darkjames-ws@darkjames.pl> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-21 23:17:20 -08:00
Daniel Borkmann	89770b0a69	net: introduce reciprocal_scale helper and convert users As David Laight suggests, we shouldn't necessarily call this reciprocal_divide() when users didn't requested a reciprocal_value(); lets keep the basic idea and call it reciprocal_scale(). More background information on this topic can be found in [1]. Joint work with Hannes Frederic Sowa. [1] http://homepage.cs.uiowa.edu/~jones/bcd/divide.html Suggested-by: David Laight <david.laight@aculab.com> Cc: Jakub Zawadzki <darkjames-ws@darkjames.pl> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-21 23:17:20 -08:00
Daniel Borkmann	f337db64af	random32: add prandom_u32_max and convert open coded users Many functions have open coded a function that returns a random number in range [0,N-1]. Under the assumption that we have a PRNG such as taus113 with being well distributed in [0, ~0U] space, we can implement such a function as uword t = (n*m')>>32, where m' is a random number obtained from PRNG, n the right open interval border and t our resulting random number, with n,m',t in u32 universe. Lets go with Joe and simply call it prandom_u32_max(), although technically we have an right open interval endpoint, but that we have documented. Other users can further be migrated to the new prandom_u32_max() function later on; for now, we need to make sure to migrate reciprocal_divide() users for the reciprocal_divide() follow-up fixup since their function signatures are going to change. Joint work with Hannes Frederic Sowa. Cc: Jakub Zawadzki <darkjames-ws@darkjames.pl> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: linux-kernel@vger.kernel.org Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-21 23:17:20 -08:00
Dongmao Zhang	5066a4df1f	dm log userspace: allow mark requests to piggyback on flush requests In the cluster evironment, cluster write has poor performance because userspace_flush() has to contact a userspace program (cmirrord) for clear/mark/flush requests. But both mark and flush requests require cmirrord to communicate the message to all the cluster nodes for each flush call. This behaviour is really slow. To address this we now merge mark and flush requests together to reduce the kernel-userspace-kernel time. We allow a new directive, "integrated_flush" that can be used to instruct the kernel log code to combine flush and mark requests when directed by userspace. If not directed by userspace (due to an older version of the userspace code perhaps), the kernel will function as it did previously - preserving backwards compatibility. Additionally, flush requests are performed lazily when only clear requests exist. Signed-off-by: Dongmao Zhang <dmzhang@suse.com> Signed-off-by: Jonathan Brassow <jbrassow@redhat.com> Signed-off-by: Alasdair G Kergon <agk@redhat.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>	2014-01-21 23:46:27 -05:00
Linus Torvalds	df32e43a54	Merge branch 'akpm' (incoming from Andrew) Merge first patch-bomb from Andrew Morton: - a couple of misc things - inotify/fsnotify work from Jan - ocfs2 updates (partial) - about half of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (117 commits) mm/migrate: remove unused function, fail_migrate_page() mm/migrate: remove putback_lru_pages, fix comment on putback_movable_pages mm/migrate: correct failure handling if !hugepage_migration_support() mm/migrate: add comment about permanent failure path mm, page_alloc: warn for non-blockable __GFP_NOFAIL allocation failure mm: compaction: reset scanner positions immediately when they meet mm: compaction: do not mark unmovable pageblocks as skipped in async compaction mm: compaction: detect when scanners meet in isolate_freepages mm: compaction: reset cached scanner pfn's before reading them mm: compaction: encapsulate defer reset logic mm: compaction: trace compaction begin and end memcg, oom: lock mem_cgroup_print_oom_info sched: add tracepoints related to NUMA task migration mm: numa: do not automatically migrate KSM pages mm: numa: trace tasks that fail migration due to rate limiting mm: numa: limit scope of lock for NUMA migrate rate limiting mm: numa: make NUMA-migrate related functions static lib/show_mem.c: show num_poisoned_pages when oom mm/hwpoison: add '#' to hwpoison_inject mm/memblock: use WARN_ONCE when MAX_NUMNODES passed as input parameter ...	2014-01-21 19:05:45 -08:00
Daniel Borkmann	3769229931	net: filter: let bpf_tell_extensions return SKF_AD_MAX Michal Sekletar added in commit `ea02f9411d` ("net: introduce SO_BPF_EXTENSIONS") a facility where user space can enquire the BPF ancillary instruction set, which is imho a step into the right direction for letting user space high-level to BPF optimizers make an informed decision for possibly using these extensions. The original rationale was to return through a getsockopt(2) a bitfield of which instructions are supported and which are not, as of right now, we just return 0 to indicate a base support for SKF_AD_PROTOCOL up to SKF_AD_PAY_OFFSET. Limitations of this approach are that this API which we need to maintain for a long time can only support a maximum of 32 extensions, and needs to be additionally maintained/updated when each new extension that comes in. I thought about this a bit more and what we can do here to overcome this is to just return SKF_AD_MAX. Since we never remove any extension since we cannot break user space and always linearly increase SKF_AD_MAX on each newly added extension, user space can make a decision on what extensions are supported in the whole set of extensions and which aren't, by just checking which of them from the whole set have an offset < SKF_AD_MAX of the underlying kernel. Since SKF_AD_MAX must be updated each time we add new ones, we don't need to introduce an additional enum and got maintenance for free. At some point in time when SO_BPF_EXTENSIONS becomes ubiquitous for most kernels, then an application can simply make use of this and easily be run on newer or older underlying kernels without needing to be recompiled, of course. Since that is for 3.14, it's not too late to do this change. Cc: Michal Sekletar <msekleta@redhat.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Acked-by: Michal Sekletar <msekleta@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-21 18:57:43 -08:00
wangweidong	5bc1d1b4a2	sctp: remove macros sctp_bh_[un]lock_sock Redefined bh_[un]lock_sock to sctp_bh[un]lock_sock for user space friendly code which we haven't use in years, so removing them. Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-21 18:41:36 -08:00
wangweidong	048ed4b626	sctp: remove macros sctp_{lock\|release}_sock Redefined {lock\|release}_sock to sctp_{lock\|release}_sock for user space friendly code which we haven't use in years, so removing them. Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-21 18:41:36 -08:00
wangweidong	1b0de194f1	sctp: remove macros sctp_read_[un]lock Redefined read_[un]lock to sctp_read_[un]lock for user space friendly code which we haven't use in years, and the macros we never used, so removing them. Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-21 18:40:41 -08:00
wangweidong	387602dfdc	sctp: remove macros sctp_write_[un]_lock Redefined write_[un]lock to sctp_write_[un]lock for user space friendly code which we haven't use in years, so removing them. Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-21 18:40:41 -08:00
wangweidong	3c8e43ba9f	sctp: remove macros sctp_spin_[un]lock Redefined spin_[un]lock to sctp_spin_[un]lock for user space friendly code which we haven't use in years, so removing them. Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-21 18:40:41 -08:00
wangweidong	79b91130a2	sctp: remove macros sctp_local_bh_{disable\|enable} Redefined local_bh_{disable\|enable} to sctp_local_bh_{disable\|enable} for user space friendly code which we haven't use in years, so removing them. Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-21 18:40:40 -08:00
wangweidong	940287ee10	sctp: remove macros sctp_spin_[un]lock_irqrestore Redefined spin_[un]lock_irqstore to sctp_spin_[un]lock_irqrestore for user space friendly code which we haven't use in years, so removing them. Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-21 18:40:40 -08:00
Linus Torvalds	fbd918a202	Merge branch 'for-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata updates from Tejun Heo: "Support for some new embedded controllers. A couple late (<= a week) fixes have stable cc'd and one patch ("SATA: MV: Add support for the optional PHYs") got committed yesterday because otherwise the resulting kernel would fail boot on an embedded board due to interdependent changes in its platform tree. Other than that, nothing too noteworthy" * 'for-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: SATA: MV: Add support for the optional PHYs sata-highbank: Remove unnecessary ahci_platform.h include libata: disable LPM for some WD SATA-I devices ARM: mvebu: update the SATA compatible string for Armada 370/XP ata: sata_mv: fix disk hotplug for Armada 370/XP SoCs ata: sata_mv: introduce compatible string "marvell, armada-370-sata" ata: pata_samsung_cf: Remove unused macros ata: pata_samsung_cf: Use devm_ioremap_resource() ata: pata_samsung_cf: Merge pata_samsung_cf.h into pata_samsung_cf.c ata: pata_samsung_cf: Move plat/regs-ata.h to drivers/ata drivers: ata: Mark the function as static in libahci.c drivers: ata: Mark the function ahci_init_interrupts() as static in ahci.c ahci: imx: fix the error handling in imx_ahci_probe() ahci: imx: ahci_imx_softreset() can be static ahci: imx: Add i.MX53 support ahci: imx: Pull out the clock enable/disable calls libata, dt: Document sata_rcar bindings sata_rcar: Add R-Car Gen2 SATA PHY support ahci: mcp89: enter AHCI mode under Apple BIOS emulation ata: libata-eh: Remove unnecessary snprintf arithmetic	2014-01-21 18:16:08 -08:00
Or Gerlitz	dc01e7d344	net: Add GRO support for vxlan traffic Add GRO handlers for vxlann, by using the UDP GRO infrastructure. For single TCP session that goes through vxlan tunneling I got nice improvement from 6.8Gbs to 11.5Gbs --> UDP/VXLAN GRO disabled $ netperf -H 192.168.52.147 -c -C $ netperf -t TCP_STREAM -H 192.168.52.147 -c -C MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.52.147 () port 0 AF_INET Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB 87380 65536 65536 10.00 6799.75 12.54 24.79 0.604 1.195 --> UDP/VXLAN GRO enabled $ netperf -t TCP_STREAM -H 192.168.52.147 -c -C MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.52.147 () port 0 AF_INET Recv Send Send Utilization Service Demand Socket Socket Message Elapsed Send Recv Send Recv Size Size Size Time Throughput local remote local remote bytes bytes bytes secs. 10^6bits/s % S % S us/KB us/KB 87380 65536 65536 10.00 11562.72 24.90 20.34 0.706 0.577 Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-21 18:05:04 -08:00
Or Gerlitz	b582ef0990	net: Add GRO support for UDP encapsulating protocols Add GRO handlers for protocols that do UDP encapsulation, with the intent of being able to coalesce packets which encapsulate packets belonging to the same TCP session. For GRO purposes, the destination UDP port takes the role of the ether type field in the ethernet header or the next protocol in the IP header. The UDP GRO handler will only attempt to coalesce packets whose destination port is registered to have gro handler. Use a mark on the skb GRO CB data to disallow (flush) running the udp gro receive code twice on a packet. This solves the problem of udp encapsulated packets whose inner VM packet is udp and happen to carry a port which has registered offloads. Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-21 18:05:04 -08:00
Linus Torvalds	f075e0f699	Merge branch 'for-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: "The bulk of changes are cleanups and preparations for the upcoming kernfs conversion. - cgroup_event mechanism which is and will be used only by memcg is moved to memcg. - pidlist handling is updated so that it can be served by seq_file. Also, the list is not sorted if sane_behavior. cgroup documentation explicitly states that the file is not sorted but it has been for quite some time. - All cgroup file handling now happens on top of seq_file. This is to prepare for kernfs conversion. In addition, all operations are restructured so that they map 1-1 to kernfs operations. - Other cleanups and low-pri fixes" * 'for-3.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (40 commits) cgroup: trivial style updates cgroup: remove stray references to css_id doc: cgroups: Fix typo in doc/cgroups cgroup: fix fail path in cgroup_load_subsys() cgroup: fix missing unlock on error in cgroup_load_subsys() cgroup: remove for_each_root_subsys() cgroup: implement for_each_css() cgroup: factor out cgroup_subsys_state creation into create_css() cgroup: combine css handling loops in cgroup_create() cgroup: reorder operations in cgroup_create() cgroup: make for_each_subsys() useable under cgroup_root_mutex cgroup: css iterations and css_from_dir() are safe under cgroup_mutex cgroup: unify pidlist and other file handling cgroup: replace cftype->read_seq_string() with cftype->seq_show() cgroup: attach cgroup_open_file to all cgroup files cgroup: generalize cgroup_pidlist_open_file cgroup: unify read path so that seq_file is always used cgroup: unify cgroup_write_X64() and cgroup_write_string() cgroup: remove cftype->read(), ->read_map() and ->write() hugetlb_cgroup: convert away from cftype->read() ...	2014-01-21 17:51:34 -08:00
Linus Torvalds	2182c815f3	The main topics this time are allocation, in the form of Bob's improvements when searching resource groups and several updates to quotas which should increase scalability. The quota changes follow on from those in the last merge window, and there will likely be further work to come in this area in due course. There are also a few patches which help to improve efficiency of adding entries into directories, and clean up some of that code. One on-disk change is included this time, which is to write some additional information which should be useful to fsck and also potentially for debugging. Other than that, its just a few small random bug fixes and clean ups. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.15 (GNU/Linux) iQIcBAABAgAGBQJS3RZUAAoJEMrg3m4a/8jS/C4QAKAoGbGH8UxG2fNRUWK94UBL lxxn0nUaQ7bwKpIbysEO6HHg2rqcfjRSdO/H6kfD51Fgg39RfIfvXF9woJw/5Is6 1xcn3faBl3dQhIq81+lSKnEJnJbQ49dbIcgxt1FGqPI8ZY6st/CBvnvqh0TRAcvN 7t0TYSPxmA/7BnwynstTbpK2l9uUisau8z12QzihJ6lRiR7QlfFSUq6Xs/ICSmkE LE2dC6gXMIIDyMpPX5Fg9NQNEAPKOzMwduUaHnv2vpuh1jkk3ObHg4mUkYtdoUZx Hbs1Ey5oqRiMs03I5wm1ayZg+6i/pT7HTM5vbrunTa21AMmJ0NTFdZ4ihVZqZwu4 ynpXY0OT/U7CpFyx6FSKtlq0kzwx5A51mL9bpSlLU9pggJY6NHQdmSJ9fdLW9cdS Wwi2Br/0rBvIhm0ejy8kxK904DhqCGsH+RXbfOc8J4VBE4R/z5eMmBG9UqMm7c+7 hO6FhpJo9QsNMe19A8IANzG2gT+PcpgrXs2GTS1QFGhFgn08+k3RLDT/SrGRxU6r MaCM9rms0SgMr2LX6MEPZdo3LJgBTNr1rKDEIzQ7BUXs3FHJ7U0swuHOTDXFVm51 gQanPYmVvFV7aiom3ExUhegDefBu1jMnY26SVxxLAbWN1Ek2Ikg6rO6wx3ckXvYo SKY3qAewE5pNpHe7u6SS =/tv0 -----END PGP SIGNATURE----- Merge tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw Pull GFS2 updates from Steven Whitehouse: "The main topics this time are allocation, in the form of Bob's improvements when searching resource groups and several updates to quotas which should increase scalability. The quota changes follow on from those in the last merge window, and there will likely be further work to come in this area in due course. There are also a few patches which help to improve efficiency of adding entries into directories, and clean up some of that code. One on-disk change is included this time, which is to write some additional information which should be useful to fsck and also potentially for debugging. Other than that, its just a few small random bug fixes and clean ups" * tag 'gfs2-merge-window' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-nmw: (24 commits) GFS2: revert "GFS2: d_splice_alias() can't return error" GFS2: Small cleanup GFS2: Don't use ENOBUFS when ENOMEM is the correct error code GFS2: Fix kbuild test robot reported warning GFS2: Move quota bitmap operations under their own lock GFS2: Clean up quota slot allocation GFS2: Only run logd and quota when mounted read/write GFS2: Use RCU/hlist_bl based hash for quotas GFS2: No need to invalidate pages for a dio read GFS2: Add initialization for address space in super block GFS2: Add hints to directory leaf blocks GFS2: For exhash conversion, only one block is needed GFS2: Increase i_writecount during gfs2_setattr_chown GFS2: Remember directory insert point GFS2: Consolidate transaction blocks calculation for dir add GFS2: Add directory addition info structure GFS2: Use only a single address space for rgrps GFS2: Use range based functions for rgrp sync/invalidation GFS2: Remove test which is always true GFS2: Remove gfs2_quota_change_host structure ...	2014-01-21 17:42:00 -08:00
Vince Bridgers	2618abb73c	stmmac: Fix kernel crashes for jumbo frames These changes correct the following issues with jumbo frames on the stmmac driver: 1) The Synopsys EMAC can be configured to support different FIFO sizes at core configuration time. There's no way to query the controller and know the FIFO size, so the driver needs to get this information from the device tree in order to know how to correctly handle MTU changes and setting up dma buffers. The default max-frame-size is as currently used, which is the size of a jumbo frame. 2) The driver was enabling Jumbo frames by default, but was not allocating dma buffers of sufficient size to handle the maximum possible packet size that could be received. This led to memory corruption since DMAs were occurring beyond the extent of the allocated receive buffers for certain types of network traffic. kernel BUG at net/core/skbuff.c:126! Internal error: Oops - BUG: 0 [#1] SMP ARM Modules linked in: CPU: 0 PID: 563 Comm: sockperf Not tainted 3.13.0-rc6-01523-gf7111b9 #31 task: ef35e580 ti: ef252000 task.ti: ef252000 PC is at skb_panic+0x60/0x64 LR is at skb_panic+0x60/0x64 pc : [<c03c7c3c>] lr : [<c03c7c3c>] psr: 60000113 sp : ef253c18 ip : 60000113 fp : 00000000 r10: ef3a5400 r9 : 00000ebc r8 : ef3a546c r7 : ee59f000 r6 : ee59f084 r5 : ee59ff40 r4 : ee59f140 r3 : 000003e2 r2 : 00000007 r1 : c0b9c420 r0 : 0000007d Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user Control: 10c5387d Table: 2e8ac04a DAC: 00000015 Process sockperf (pid: 563, stack limit = 0xef252248) Stack: (0xef253c18 to 0xef254000) 3c00: 00000ebc ee59f000 3c20: ee59f084 ee59ff40 ee59f140 c04a9cd8 ee8c50c0 00000ebc ee59ff40 00000000 3c40: ee59f140 c02d0ef0 00000056 ef1eda80 ee8c50c0 00000ebc 22bbef29 c0318f8c 3c60: 00000056 ef3a547c ffe2c716 c02c9c90 c0ba1298 ef3a5838 ef3a5838 ef3a5400 3c80: 000020c0 ee573840 000055cb ef3f2050 c053f0e0 c0319214 22b9b085 22d92813 3ca0: 00001c80 004b8e00 ef3a5400 ee573840 ef3f2064 22d92813 ef3f2064 000055cb 3cc0: ef3f2050 c031a19c ef252000 00000000 00000000 c0561bc0 00000000 ff00ffff 3ce0: c05621c0 ef3a5400 ef3f2064 ee573840 00000020 ef3f2064 000055cb ef3f2050 3d00: c053f0e0 c031cad0 c053e740 00000e60 00000000 00000000 ee573840 ef3a5400 3d20: ef0a6e00 00000000 ef3f2064 c032507c 00010000 00000020 c0561bc0 c0561bc0 3d40: ee599850 c032799c 00000000 ee573840 c055a380 ef3a5400 00000000 ef3f2064 3d60: ef3f2050 c032799c 0101c7c0 2b6755cb c059a280 c030e4d8 000055cb ffffffff 3d80: ee574fc0 c055a380 ee574000 ee573840 00002b67 ee573840 c03fe9c4 c053fa68 3da0: c055a380 00001f6f 00000000 ee573840 c053f0e0 c0304fdc ef0a6e01 ef3f2050 3dc0: ee573858 ef031000 ee573840 c03055d8 c0ba0c40 ef000f40 00100100 c053f0dc 3de0: c053ffdc c053f0f0 00000008 00000000 ef031000 c02da948 00001140 00000000 3e00: c0563c78 ef253e5f 00000020 ee573840 00000020 c053f0f0 ef313400 ee573840 3e20: c053f0e0 00000000 00000000 c05380c0 ef313400 00001000 00000015 c02df280 3e40: ee574000 ef001e00 00000000 00001080 00000042 005cd980 ef031500 ef031500 3e60: 00000000 c02df824 ef031500 c053e390 c0541084 f00b1e00 c05925e8 c02df864 3e80: 00001f5c ef031440 c053e390 c0278524 00000002 00000000 c0b9eb48 c02df280 3ea0: ee8c7180 00000100 c0542ca8 00000015 00000040 ef031500 ef031500 ef031500 3ec0: c027803c ef252000 00000040 000000ec c05380c0 c0b9eb40 c0b9eb48 c02df940 3ee0: ef060780 ffffa4dd c0564a9c c056343c 002e80a8 00000080 ef031500 00000001 3f00: c053808c ef252000 fffec100 00000003 00000004 002e80a8 0000000c c00258f0 3f20: 002e80a8 c005e704 00000005 00000100 c05634d0 c0538080 c05333e0 00000000 3f40: 0000000a c0565580 c05380c0 ffffa4dc c05434f4 00400100 00000004 c0534cd4 3f60: 00000098 00000000 fffec100 002e80a8 00000004 002e80a8 002a20e0 c0025da8 3f80: c0534cd4 c000f020 fffec10c c053ea60 ef253fb0 c0008530 0000ffe2 b6ef67f4 3fa0: 40000010 ffffffff 00000124 c0012f3c 0000ffe2 002e80f0 0000ffe2 00004000 3fc0: becb6338 becb6334 00000004 00000124 002e80a8 00000004 002e80a8 002a20e0 3fe0: becb6300 becb62f4 002773bb b6ef67f4 40000010 ffffffff 00000000 00000000 [<c03c7c3c>] (skb_panic+0x60/0x64) from [<c02d0ef0>] (skb_put+0x4c/0x50) [<c02d0ef0>] (skb_put+0x4c/0x50) from [<c0318f8c>] (tcp_collapse+0x314/0x3ec) [<c0318f8c>] (tcp_collapse+0x314/0x3ec) from [<c0319214>] (tcp_try_rmem_schedule+0x1b0/0x3c4) [<c0319214>] (tcp_try_rmem_schedule+0x1b0/0x3c4) from [<c031a19c>] (tcp_data_queue+0x480/0xe6c) [<c031a19c>] (tcp_data_queue+0x480/0xe6c) from [<c031cad0>] (tcp_rcv_established+0x180/0x62c) [<c031cad0>] (tcp_rcv_established+0x180/0x62c) from [<c032507c>] (tcp_v4_do_rcv+0x13c/0x31c) [<c032507c>] (tcp_v4_do_rcv+0x13c/0x31c) from [<c032799c>] (tcp_v4_rcv+0x718/0x73c) [<c032799c>] (tcp_v4_rcv+0x718/0x73c) from [<c0304fdc>] (ip_local_deliver+0x98/0x274) [<c0304fdc>] (ip_local_deliver+0x98/0x274) from [<c03055d8>] (ip_rcv+0x420/0x758) [<c03055d8>] (ip_rcv+0x420/0x758) from [<c02da948>] (__netif_receive_skb_core+0x44c/0x5bc) [<c02da948>] (__netif_receive_skb_core+0x44c/0x5bc) from [<c02df280>] (netif_receive_skb+0x48/0xb4) [<c02df280>] (netif_receive_skb+0x48/0xb4) from [<c02df824>] (napi_gro_flush+0x70/0x94) [<c02df824>] (napi_gro_flush+0x70/0x94) from [<c02df864>] (napi_complete+0x1c/0x34) [<c02df864>] (napi_complete+0x1c/0x34) from [<c0278524>] (stmmac_poll+0x4e8/0x5c8) [<c0278524>] (stmmac_poll+0x4e8/0x5c8) from [<c02df940>] (net_rx_action+0xc4/0x1e4) [<c02df940>] (net_rx_action+0xc4/0x1e4) from [<c00258f0>] (__do_softirq+0x12c/0x2e8) [<c00258f0>] (__do_softirq+0x12c/0x2e8) from [<c0025da8>] (irq_exit+0x78/0xac) [<c0025da8>] (irq_exit+0x78/0xac) from [<c000f020>] (handle_IRQ+0x44/0x90) [<c000f020>] (handle_IRQ+0x44/0x90) from [<c0008530>] (gic_handle_irq+0x2c/0x5c) [<c0008530>] (gic_handle_irq+0x2c/0x5c) from [<c0012f3c>] (__irq_usr+0x3c/0x60) 3) The driver was setting the dma buffer size after allocating dma buffers, which caused a system panic when changing the MTU. BUG: Bad page state in process ifconfig pfn:2e850 page:c0b72a00 count:0 mapcount:0 mapping: (null) index:0x0 page flags: 0x200(arch_1) Modules linked in: CPU: 0 PID: 566 Comm: ifconfig Not tainted 3.13.0-rc6-01523-gf7111b9 #29 [<c001547c>] (unwind_backtrace+0x0/0xf8) from [<c00122dc>] (show_stack+0x10/0x14) [<c00122dc>] (show_stack+0x10/0x14) from [<c03c793c>] (dump_stack+0x70/0x88) [<c03c793c>] (dump_stack+0x70/0x88) from [<c00b2620>] (bad_page+0xc8/0x118) [<c00b2620>] (bad_page+0xc8/0x118) from [<c00b302c>] (get_page_from_freelist+0x744/0x870) [<c00b302c>] (get_page_from_freelist+0x744/0x870) from [<c00b40f4>] (__alloc_pages_nodemask+0x118/0x86c) [<c00b40f4>] (__alloc_pages_nodemask+0x118/0x86c) from [<c00b4858>] (__get_free_pages+0x10/0x54) [<c00b4858>] (__get_free_pages+0x10/0x54) from [<c00cba1c>] (kmalloc_order_trace+0x24/0xa0) [<c00cba1c>] (kmalloc_order_trace+0x24/0xa0) from [<c02d199c>] (__kmalloc_reserve.isra.21+0x24/0x70) [<c02d199c>] (__kmalloc_reserve.isra.21+0x24/0x70) from [<c02d240c>] (__alloc_skb+0x68/0x13c) [<c02d240c>] (__alloc_skb+0x68/0x13c) from [<c02d3930>] (__netdev_alloc_skb+0x3c/0xe8) [<c02d3930>] (__netdev_alloc_skb+0x3c/0xe8) from [<c0279378>] (stmmac_open+0x63c/0x1024) [<c0279378>] (stmmac_open+0x63c/0x1024) from [<c02e18cc>] (__dev_open+0xa0/0xfc) [<c02e18cc>] (__dev_open+0xa0/0xfc) from [<c02e1b40>] (__dev_change_flags+0x94/0x158) [<c02e1b40>] (__dev_change_flags+0x94/0x158) from [<c02e1c24>] (dev_change_flags+0x18/0x48) [<c02e1c24>] (dev_change_flags+0x18/0x48) from [<c0337bc0>] (devinet_ioctl+0x638/0x700) [<c0337bc0>] (devinet_ioctl+0x638/0x700) from [<c02c7aec>] (sock_ioctl+0x64/0x290) [<c02c7aec>] (sock_ioctl+0x64/0x290) from [<c0100890>] (do_vfs_ioctl+0x78/0x5b8) [<c0100890>] (do_vfs_ioctl+0x78/0x5b8) from [<c0100e0c>] (SyS_ioctl+0x3c/0x5c) [<c0100e0c>] (SyS_ioctl+0x3c/0x5c) from [<c000e760>] The fixes have been verified using reproducible, automated testing. Signed-off-by: Vince Bridgers <vbridgers2013@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-21 17:05:27 -08:00
Hannes Frederic Sowa	82b276cd2b	ipv6: protect protocols not handling ipv4 from v4 connection/bind attempts Some ipv6 protocols cannot handle ipv4 addresses, so we must not allow connecting and binding to them. sendmsg logic does already check msg->name for this but must trust already connected sockets which could be set up for connection to ipv4 address family. Per-socket flag ipv6only is of no use here, as it is under users control by setsockopt. Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2014-01-21 16:59:19 -08:00
Joonsoo Kim	78d5506e82	mm/migrate: remove unused function, fail_migrate_page() fail_migrate_page() isn't used anywhere, so remove it. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Christoph Lameter <cl@linux.com> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Rafael Aquini <aquini@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-21 16:19:49 -08:00
Joonsoo Kim	59c82b70dc	mm/migrate: remove putback_lru_pages, fix comment on putback_movable_pages Some part of putback_lru_pages() and putback_movable_pages() is duplicated, so it could confuse us what we should use. We can remove putback_lru_pages() since it is not really needed now. This makes us undestand and maintain the code more easily. And comment on putback_movable_pages() is stale now, so fix it. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Reviewed-by: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Christoph Lameter <cl@linux.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Rafael Aquini <aquini@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-21 16:19:49 -08:00
Vlastimil Babka	de6c60a6c1	mm: compaction: encapsulate defer reset logic Currently there are several functions to manipulate the deferred compaction state variables. The remaining case where the variables are touched directly is when a successful allocation occurs in direct compaction, or is expected to be successful in the future by kswapd. Here, the lowest order that is expected to fail is updated, and in the case of successful allocation, the deferred status and counter is reset completely. Create a new function compaction_defer_reset() to encapsulate this functionality and make it easier to understand the code. No functional change. Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-21 16:19:48 -08:00
Mel Gorman	0eb927c0ab	mm: compaction: trace compaction begin and end The broad goal of the series is to improve allocation success rates for huge pages through memory compaction, while trying not to increase the compaction overhead. The original objective was to reintroduce capturing of high-order pages freed by the compaction, before they are split by concurrent activity. However, several bugs and opportunities for simple improvements were found in the current implementation, mostly through extra tracepoints (which are however too ugly for now to be considered for sending). The patches mostly deal with two mechanisms that reduce compaction overhead, which is caching the progress of migrate and free scanners, and marking pageblocks where isolation failed to be skipped during further scans. Patch 1 (from mgorman) adds tracepoints that allow calculate time spent in compaction and potentially debug scanner pfn values. Patch 2 encapsulates the some functionality for handling deferred compactions for better maintainability, without a functional change type is not determined without being actually needed. Patch 3 fixes a bug where cached scanner pfn's are sometimes reset only after they have been read to initialize a compaction run. Patch 4 fixes a bug where scanners meeting is sometimes not properly detected and can lead to multiple compaction attempts quitting early without doing any work. Patch 5 improves the chances of sync compaction to process pageblocks that async compaction has skipped due to being !MIGRATE_MOVABLE. Patch 6 improves the chances of sync direct compaction to actually do anything when called after async compaction fails during allocation slowpath. The impact of patches were validated using mmtests's stress-highalloc benchmark with mmtests's stress-highalloc benchmark on a x86_64 machine with 4GB memory. Due to instability of the results (mostly related to the bugs fixed by patches 2 and 3), 10 iterations were performed, taking min,mean,max values for success rates and mean values for time and vmstat-based metrics. First, the default GFP_HIGHUSER_MOVABLE allocations were tested with the patches stacked on top of v3.13-rc2. Patch 2 is OK to serve as baseline due to no functional changes in 1 and 2. Comments below. stress-highalloc 3.13-rc2 3.13-rc2 3.13-rc2 3.13-rc2 3.13-rc2 2-nothp 3-nothp 4-nothp 5-nothp 6-nothp Success 1 Min 9.00 ( 0.00%) 10.00 (-11.11%) 43.00 (-377.78%) 43.00 (-377.78%) 33.00 (-266.67%) Success 1 Mean 27.50 ( 0.00%) 25.30 ( 8.00%) 45.50 (-65.45%) 45.90 (-66.91%) 46.30 (-68.36%) Success 1 Max 36.00 ( 0.00%) 36.00 ( 0.00%) 47.00 (-30.56%) 48.00 (-33.33%) 52.00 (-44.44%) Success 2 Min 10.00 ( 0.00%) 8.00 ( 20.00%) 46.00 (-360.00%) 45.00 (-350.00%) 35.00 (-250.00%) Success 2 Mean 26.40 ( 0.00%) 23.50 ( 10.98%) 47.30 (-79.17%) 47.60 (-80.30%) 48.10 (-82.20%) Success 2 Max 34.00 ( 0.00%) 33.00 ( 2.94%) 48.00 (-41.18%) 50.00 (-47.06%) 54.00 (-58.82%) Success 3 Min 65.00 ( 0.00%) 63.00 ( 3.08%) 85.00 (-30.77%) 84.00 (-29.23%) 85.00 (-30.77%) Success 3 Mean 76.70 ( 0.00%) 70.50 ( 8.08%) 86.20 (-12.39%) 85.50 (-11.47%) 86.00 (-12.13%) Success 3 Max 87.00 ( 0.00%) 86.00 ( 1.15%) 88.00 ( -1.15%) 87.00 ( 0.00%) 87.00 ( 0.00%) 3.13-rc2 3.13-rc2 3.13-rc2 3.13-rc2 3.13-rc2 2-nothp 3-nothp 4-nothp 5-nothp 6-nothp User 6437.72 6459.76 5960.32 5974.55 6019.67 System 1049.65 1049.09 1029.32 1031.47 1032.31 Elapsed 1856.77 1874.48 1949.97 1994.22 1983.15 3.13-rc2 3.13-rc2 3.13-rc2 3.13-rc2 3.13-rc2 2-nothp 3-nothp 4-nothp 5-nothp 6-nothp Minor Faults 253952267 254581900 250030122 250507333 250157829 Major Faults 420 407 506 530 530 Swap Ins 4 9 9 6 6 Swap Outs 398 375 345 346 333 Direct pages scanned 197538 189017 298574 287019 299063 Kswapd pages scanned 1809843 1801308 1846674 1873184 1861089 Kswapd pages reclaimed 1806972 1798684 1844219 1870509 1858622 Direct pages reclaimed 197227 188829 298380 286822 298835 Kswapd efficiency 99% 99% 99% 99% 99% Kswapd velocity 953.382 970.449 952.243 934.569 922.286 Direct efficiency 99% 99% 99% 99% 99% Direct velocity 104.058 101.832 153.961 143.200 148.205 Percentage direct scans 9% 9% 13% 13% 13% Zone normal velocity 347.289 359.676 348.063 339.933 332.983 Zone dma32 velocity 710.151 712.605 758.140 737.835 737.507 Zone dma velocity 0.000 0.000 0.000 0.000 0.000 Page writes by reclaim 557.600 429.000 353.600 426.400 381.800 Page writes file 159 53 7 79 48 Page writes anon 398 375 345 346 333 Page reclaim immediate 825 644 411 575 420 Sector Reads 2781750 2769780 2878547 2939128 2910483 Sector Writes 12080843 12083351 12012892 12002132 12010745 Page rescued immediate 0 0 0 0 0 Slabs scanned `1575654` 1545344 1778406 1786700 1794073 Direct inode steals 9657 10037 15795 14104 14645 Kswapd inode steals 46857 46335 50543 50716 51796 Kswapd skipped wait 0 0 0 0 0 THP fault alloc 97 91 81 71 77 THP collapse alloc 456 506 546 544 565 THP splits 6 5 5 4 4 THP fault fallback 0 1 0 0 0 THP collapse fail 14 14 12 13 12 Compaction stalls 1006 980 1537 1536 1548 Compaction success 303 284 562 559 578 Compaction failures 702 696 974 976 969 Page migrate success 1177325 1070077 3927538 3781870 3877057 Page migrate failure 0 0 0 0 0 Compaction pages isolated 2547248 2306457 8301218 8008500 8200674 Compaction migrate scanned 42290478 38832618 153961130 154143900 159141197 Compaction free scanned 89199429 79189151 356529027 351943166 356326727 Compaction cost 1566 1426 5312 5156 5294 NUMA PTE updates 0 0 0 0 0 NUMA hint faults 0 0 0 0 0 NUMA hint local faults 0 0 0 0 0 NUMA hint local percent 100 100 100 100 100 NUMA pages migrated 0 0 0 0 0 AutoNUMA cost 0 0 0 0 0 Observations: - The "Success 3" line is allocation success rate with system idle (phases 1 and 2 are with background interference). I used to get stable values around 85% with vanilla 3.11. The lower min and mean values came with 3.12. This was bisected to commit `81c0a2bb` ("mm: page_alloc: fair zone allocator policy") As explained in comment for patch 3, I don't think the commit is wrong, but that it makes the effect of compaction bugs worse. From patch 3 onwards, the results are OK and match the 3.11 results. - Patch 4 also clearly helps phases 1 and 2, and exceeds any results I've seen with 3.11 (I didn't measure it that thoroughly then, but it was never above 40%). - Compaction cost and number of scanned pages is higher, especially due to patch 4. However, keep in mind that patches 3 and 4 fix existing bugs in the current design of compaction overhead mitigation, they do not change it. If overhead is found unacceptable, then it should be decreased differently (and consistently, not due to random conditions) than the current implementation does. In contrast, patches 5 and 6 (which are not strictly bug fixes) do not increase the overhead (but also not success rates). This might be a limitation of the stress-highalloc benchmark as it's quite uniform. Another set of results is when configuring stress-highalloc t allocate with similar flags as THP uses: (GFP_HIGHUSER_MOVABLE\|__GFP_NOMEMALLOC\|__GFP_NORETRY\|__GFP_NO_KSWAPD) stress-highalloc 3.13-rc2 3.13-rc2 3.13-rc2 3.13-rc2 3.13-rc2 2-thp 3-thp 4-thp 5-thp 6-thp Success 1 Min 2.00 ( 0.00%) 7.00 (-250.00%) 18.00 (-800.00%) 19.00 (-850.00%) 26.00 (-1200.00%) Success 1 Mean 19.20 ( 0.00%) 17.80 ( 7.29%) 29.20 (-52.08%) 29.90 (-55.73%) 32.80 (-70.83%) Success 1 Max 27.00 ( 0.00%) 29.00 ( -7.41%) 35.00 (-29.63%) 36.00 (-33.33%) 37.00 (-37.04%) Success 2 Min 3.00 ( 0.00%) 8.00 (-166.67%) 21.00 (-600.00%) 21.00 (-600.00%) 32.00 (-966.67%) Success 2 Mean 19.30 ( 0.00%) 17.90 ( 7.25%) 32.20 (-66.84%) 32.60 (-68.91%) 35.70 (-84.97%) Success 2 Max 27.00 ( 0.00%) 30.00 (-11.11%) 36.00 (-33.33%) 37.00 (-37.04%) 39.00 (-44.44%) Success 3 Min 62.00 ( 0.00%) 62.00 ( 0.00%) 85.00 (-37.10%) 75.00 (-20.97%) 64.00 ( -3.23%) Success 3 Mean 66.30 ( 0.00%) 65.50 ( 1.21%) 85.60 (-29.11%) 83.40 (-25.79%) 83.50 (-25.94%) Success 3 Max 70.00 ( 0.00%) 69.00 ( 1.43%) 87.00 (-24.29%) 86.00 (-22.86%) 87.00 (-24.29%) 3.13-rc2 3.13-rc2 3.13-rc2 3.13-rc2 3.13-rc2 2-thp 3-thp 4-thp 5-thp 6-thp User 6547.93 6475.85 6265.54 6289.46 6189.96 System 1053.42 1047.28 1043.23 1042.73 1038.73 Elapsed 1835.43 1821.96 1908.67 1912.74 1956.38 3.13-rc2 3.13-rc2 3.13-rc2 3.13-rc2 3.13-rc2 2-thp 3-thp 4-thp 5-thp 6-thp Minor Faults 256805673 253106328 253222299 249830289 251184418 Major Faults 395 375 423 434 448 Swap Ins 12 10 10 12 9 Swap Outs 530 537 487 455 415 Direct pages scanned 71859 86046 153244 152764 190713 Kswapd pages scanned 1900994 1870240 1898012 1892864 1880520 Kswapd pages reclaimed 1897814 1867428 1894939 1890125 1877924 Direct pages reclaimed 71766 85908 153167 152643 190600 Kswapd efficiency 99% 99% 99% 99% 99% Kswapd velocity 1029.000 1067.782 1000.091 991.049 951.218 Direct efficiency 99% 99% 99% 99% 99% Direct velocity 38.897 49.127 80.747 79.983 96.468 Percentage direct scans 3% 4% 7% 7% 9% Zone normal velocity 351.377 372.494 348.910 341.689 335.310 Zone dma32 velocity 716.520 744.414 731.928 729.343 712.377 Zone dma velocity 0.000 0.000 0.000 0.000 0.000 Page writes by reclaim 669.300 604.000 545.700 538.900 429.900 Page writes file 138 66 58 83 14 Page writes anon 530 537 487 455 415 Page reclaim immediate 806 655 772 548 517 Sector Reads 2711956 2703239 2811602 2818248 2839459 Sector Writes 12163238 12018662 12038248 11954736 11994892 Page rescued immediate 0 0 0 0 0 Slabs scanned 1385088 1388364 1507968 1513292 1558656 Direct inode steals 1739 2564 4622 5496 6007 Kswapd inode steals 47461 46406 47804 48013 48466 Kswapd skipped wait 0 0 0 0 0 THP fault alloc 110 82 84 69 70 THP collapse alloc 445 482 467 462 539 THP splits 6 5 4 5 3 THP fault fallback 3 0 0 0 0 THP collapse fail 15 14 14 14 13 Compaction stalls 659 685 1033 1073 1111 Compaction success 222 225 410 427 456 Compaction failures 436 460 622 646 655 Page migrate success 446594 439978 1085640 1095062 1131716 Page migrate failure 0 0 0 0 0 Compaction pages isolated 1029475 1013490 2453074 2482698 2565400 Compaction migrate scanned `9955461` 11344259 24375202 27978356 30494204 Compaction free scanned 27715272 28544654 80150615 82898631 85756132 Compaction cost 552 555 1344 1379 1436 NUMA PTE updates 0 0 0 0 0 NUMA hint faults 0 0 0 0 0 NUMA hint local faults 0 0 0 0 0 NUMA hint local percent 100 100 100 100 100 NUMA pages migrated 0 0 0 0 0 AutoNUMA cost 0 0 0 0 0 There are some differences from the previous results for THP-like allocations: - Here, the bad result for unpatched kernel in phase 3 is much more consistent to be between 65-70% and not related to the "regression" in 3.12. Still there is the improvement from patch 4 onwards, which brings it on par with simple GFP_HIGHUSER_MOVABLE allocations. - Compaction costs have increased, but nowhere near as much as the non-THP case. Again, the patches should be worth the gained determininsm. - Patches 5 and 6 somewhat increase the number of migrate-scanned pages. This is most likely due to __GFP_NO_KSWAPD flag, which means the cached pfn's and pageblock skip bits are not reset by kswapd that often (at least in phase 3 where no concurrent activity would wake up kswapd) and the patches thus help the sync-after-async compaction. It doesn't however show that the sync compaction would help so much with success rates, which can be again seen as a limitation of the benchmark scenario. This patch (of 6): Add two tracepoints for compaction begin and end of a zone. Using this it is possible to calculate how much time a workload is spending within compaction and potentially debug problems related to cached pfns for scanning. In combination with the direct reclaim and slab trace points it should be possible to estimate most allocation-related overhead for a workload. Signed-off-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Cc: Rik van Riel <riel@redhat.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-21 16:19:48 -08:00
Mel Gorman	286549dcaf	sched: add tracepoints related to NUMA task migration This patch adds three tracepoints o trace_sched_move_numa when a task is moved to a node o trace_sched_swap_numa when a task is swapped with another task o trace_sched_stick_numa when a numa-related migration fails The tracepoints allow the NUMA scheduler activity to be monitored and the following high-level metrics can be calculated o NUMA migrated stuck nr trace_sched_stick_numa o NUMA migrated idle nr trace_sched_move_numa o NUMA migrated swapped nr trace_sched_swap_numa o NUMA local swapped trace_sched_swap_numa src_nid == dst_nid (should never happen) o NUMA remote swapped trace_sched_swap_numa src_nid != dst_nid (should == NUMA migrated swapped) o NUMA group swapped trace_sched_swap_numa src_ngid == dst_ngid Maybe a small number of these are acceptable but a high number would be a major surprise. It would be even worse if bounces are frequent. o NUMA avg task migs. Average number of migrations for tasks o NUMA stddev task mig Self-explanatory o NUMA max task migs. Maximum number of migrations for a single task In general the intent of the tracepoints is to help diagnose problems where automatic NUMA balancing appears to be doing an excessive amount of useless work. [akpm@linux-foundation.org: remove semicolon-after-if, repair coding-style] Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Alex Thorlton <athorlton@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-21 16:19:48 -08:00
Mel Gorman	af1839d722	mm: numa: trace tasks that fail migration due to rate limiting A low local/remote numa hinting fault ratio is potentially explained by failed migrations. This patch adds a tracepoint that fires when migration fails due to migration rate limitation. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Alex Thorlton <athorlton@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-21 16:19:48 -08:00
Mel Gorman	1c5e9c27cb	mm: numa: limit scope of lock for NUMA migrate rate limiting NUMA migrate rate limiting protects a migration counter and window using a lock but in some cases this can be a contended lock. It is not critical that the number of pages be perfect, lost updates are acceptable. Reduce the importance of this lock. Signed-off-by: Mel Gorman <mgorman@suse.de> Reviewed-by: Rik van Riel <riel@redhat.com> Cc: Alex Thorlton <athorlton@sgi.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-21 16:19:48 -08:00
Santosh Shilimkar	26f09e9b3a	mm/memblock: add memblock memory allocation apis Introduce memblock memory allocation APIs which allow to support PAE or LPAE extension on 32 bits archs where the physical memory start address can be beyond 4GB. In such cases, existing bootmem APIs which operate on 32 bit addresses won't work and needs memblock layer which operates on 64 bit addresses. So we add equivalent APIs so that we can replace usage of bootmem with memblock interfaces. Architectures already converted to NO_BOOTMEM use these new memblock interfaces. The architectures which are still not converted to NO_BOOTMEM continue to function as is because we still maintain the fal lback option of bootmem back-end supporting these new interfaces. So no functional change as such. In long run, once all the architectures moves to NO_BOOTMEM, we can get rid of bootmem layer completely. This is one step to remove the core code dependency with bootmem and also gives path for architectures to move away from bootmem. The proposed interface will became active if both CONFIG_HAVE_MEMBLOCK and CONFIG_NO_BOOTMEM are specified by arch. In case !CONFIG_NO_BOOTMEM, the memblock() wrappers will fallback to the existing bootmem apis so that arch's not converted to NO_BOOTMEM continue to work as is. The meaning of MEMBLOCK_ALLOC_ACCESSIBLE and MEMBLOCK_ALLOC_ANYWHERE is kept same. [akpm@linux-foundation.org: s/depricated/deprecated/] Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Paul Walmsley <paul@pwsan.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Russell King <linux@arm.linux.org.uk> Cc: Tony Lindgren <tony@atomide.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-21 16:19:46 -08:00
Grygorii Strashko	b115423357	mm/memblock: switch to use NUMA_NO_NODE instead of MAX_NUMNODES It's recommended to use NUMA_NO_NODE everywhere to select "process any node" behavior or to indicate that "no node id specified". Hence, update __next_free_mem_range*() API's to accept both NUMA_NO_NODE and MAX_NUMNODES, but emit warning once on MAX_NUMNODES, and correct corresponding API's documentation to describe new behavior. Also, update other memblock/nobootmem APIs where MAX_NUMNODES is used dirrectly. The change was suggested by Tejun Heo. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Paul Walmsley <paul@pwsan.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Russell King <linux@arm.linux.org.uk> Cc: Tony Lindgren <tony@atomide.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-21 16:19:46 -08:00
Grygorii Strashko	87029ee939	mm/memblock: reorder parameters of memblock_find_in_range_node Reorder parameters of memblock_find_in_range_node to be consistent with other memblock APIs. The change was suggested by Tejun Heo <tj@kernel.org>. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Tejun Heo <tj@kernel.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Paul Walmsley <paul@pwsan.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Russell King <linux@arm.linux.org.uk> Cc: Tony Lindgren <tony@atomide.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-21 16:19:46 -08:00
Grygorii Strashko	10e89523bf	mm/bootmem: remove duplicated declaration of __free_pages_bootmem() The __free_pages_bootmem is used internally by MM core and already defined in internal.h. So, remove duplicated declaration. Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Reviewed-by: Tejun Heo <tj@kernel.org> Cc: Yinghai Lu <yinghai@kernel.org> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: Paul Walmsley <paul@pwsan.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Russell King <linux@arm.linux.org.uk> Cc: Tony Lindgren <tony@atomide.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-21 16:19:46 -08:00
Oleg Nesterov	0c740d0afc	introduce for_each_thread() to replace the buggy while_each_thread() while_each_thread() and next_thread() should die, almost every lockless usage is wrong. 1. Unless g == current, the lockless while_each_thread() is not safe. while_each_thread(g, t) can loop forever if g exits, next_thread() can't reach the unhashed thread in this case. Note that this can happen even if g is the group leader, it can exec. 2. Even if while_each_thread() itself was correct, people often use it wrongly. It was never safe to just take rcu_read_lock() and loop unless you verify that pid_alive(g) == T, even the first next_thread() can point to the already freed/reused memory. This patch adds signal_struct->thread_head and task->thread_node to create the normal rcu-safe list with the stable head. The new for_each_thread(g, t) helper is always safe under rcu_read_lock() as long as this task_struct can't go away. Note: of course it is ugly to have both task_struct->thread_node and the old task_struct->thread_group, we will kill it later, after we change the users of while_each_thread() to use for_each_thread(). Perhaps we can kill it even before we convert all users, we can reimplement next_thread(t) using the new thread_head/thread_node. But we can't do this right now because this will lead to subtle behavioural changes. For example, do/while_each_thread() always sees at least one task, while for_each_thread() can do nothing if the whole thread group has died. Or thread_group_empty(), currently its semantics is not clear unless thread_group_leader(p) and we need to audit the callers before we can change it. So this patch adds the new interface which has to coexist with the old one for some time, hopefully the next changes will be more or less straightforward and the old one will go away soon. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Sergey Dyasly <dserrg@gmail.com> Tested-by: Sergey Dyasly <dserrg@gmail.com> Reviewed-by: Sameer Nanda <snanda@chromium.org> Acked-by: David Rientjes <rientjes@google.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mandeep Singh Baines <msb@chromium.org> Cc: "Ma, Xindong" <xindong.ma@intel.com> Cc: Michal Hocko <mhocko@suse.cz> Cc: "Tu, Xiaobing" <xiaobing.tu@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-21 16:19:46 -08:00
Joonsoo Kim	9f32624be9	mm/rmap: use rmap_walk() in page_referenced() Now, we have an infrastructure in rmap_walk() to handle difference from variants of rmap traversing functions. So, just use it in page_referenced(). In this patch, I change following things. 1. remove some variants of rmap traversing functions. cf> page_referenced_ksm, page_referenced_anon, page_referenced_file 2. introduce new struct page_referenced_arg and pass it to page_referenced_one(), main function of rmap_walk, in order to count reference, to store vm_flags and to check finish condition. 3. mechanical change to use rmap_walk() in page_referenced(). [liwanp@linux.vnet.ibm.com: fix BUG at rmap_walk] Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Hillf Danton <dhillf@gmail.com> Signed-off-by: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-21 16:19:45 -08:00
Joonsoo Kim	e8351ac9bf	mm/rmap: use rmap_walk() in try_to_munlock() Now, we have an infrastructure in rmap_walk() to handle difference from variants of rmap traversing functions. So, just use it in try_to_munlock(). In this patch, I change following things. 1. remove some variants of rmap traversing functions. cf> try_to_unmap_ksm, try_to_unmap_anon, try_to_unmap_file 2. mechanical change to use rmap_walk() in try_to_munlock(). 3. copy and paste comments. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Hillf Danton <dhillf@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-21 16:19:45 -08:00
Joonsoo Kim	5262950642	mm/rmap: use rmap_walk() in try_to_unmap() Now, we have an infrastructure in rmap_walk() to handle difference from variants of rmap traversing functions. So, just use it in try_to_unmap(). In this patch, I change following things. 1. enable rmap_walk() if !CONFIG_MIGRATION. 2. mechanical change to use rmap_walk() in try_to_unmap(). Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Hillf Danton <dhillf@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-21 16:19:45 -08:00
Joonsoo Kim	0dd1c7bbce	mm/rmap: extend rmap_walk_xxx() to cope with different cases There are a lot of common parts in traversing functions, but there are also a little of uncommon parts in it. By assigning proper function pointer on each rmap_walker_control, we can handle these difference correctly. Following are differences we should handle. 1. difference of lock function in anon mapping case 2. nonlinear handling in file mapping case 3. prechecked condition: checking memcg in page_referenced(), checking VM_SHARE in page_mkclean() checking temporary vma in try_to_unmap() 4. exit condition: checking page_mapped() in try_to_unmap() So, in this patch, I introduce 4 function pointers to handle above differences. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Hillf Danton <dhillf@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-21 16:19:45 -08:00
Joonsoo Kim	051ac83adf	mm/rmap: make rmap_walk to get the rmap_walk_control argument In each rmap traverse case, there is some difference so that we need function pointers and arguments to them in order to handle these For this purpose, struct rmap_walk_control is introduced in this patch, and will be extended in following patch. Introducing and extending are separate, because it clarify changes. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Hugh Dickins <hughd@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Hillf Danton <dhillf@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-21 16:19:45 -08:00
Tang Chen	55ac590c2f	memblock, mem_hotplug: make memblock skip hotpluggable regions if needed Linux kernel cannot migrate pages used by the kernel. As a result, hotpluggable memory used by the kernel won't be able to be hot-removed. To solve this problem, the basic idea is to prevent memblock from allocating hotpluggable memory for the kernel at early time, and arrange all hotpluggable memory in ACPI SRAT(System Resource Affinity Table) as ZONE_MOVABLE when initializing zones. In the previous patches, we have marked hotpluggable memory regions with MEMBLOCK_HOTPLUG flag in memblock.memory. In this patch, we make memblock skip these hotpluggable memory regions in the default top-down allocation function if movable_node boot option is specified. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Rafael J . Wysocki" <rjw@sisk.pl> Cc: Chen Tang <imtangchen@gmail.com> Cc: Gong Chen <gong.chen@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Len Brown <lenb@kernel.org> Cc: Liu Jiang <jiang.liu@huawei.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Taku Izumi <izumi.taku@jp.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Renninger <trenn@suse.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-21 16:19:45 -08:00
Tang Chen	e7e8de5918	memblock: make memblock_set_node() support different memblock_type [sfr@canb.auug.org.au: fix powerpc build] Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Rafael J . Wysocki" <rjw@sisk.pl> Cc: Chen Tang <imtangchen@gmail.com> Cc: Gong Chen <gong.chen@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Len Brown <lenb@kernel.org> Cc: Liu Jiang <jiang.liu@huawei.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Taku Izumi <izumi.taku@jp.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Renninger <trenn@suse.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-21 16:19:44 -08:00
Tang Chen	66b16edf9e	memblock, mem_hotplug: introduce MEMBLOCK_HOTPLUG flag to mark hotpluggable regions In find_hotpluggable_memory, once we find out a memory region which is hotpluggable, we want to mark them in memblock.memory. So that we could control memblock allocator not to allocte hotpluggable memory for the kernel later. To achieve this goal, we introduce MEMBLOCK_HOTPLUG flag to indicate the hotpluggable memory regions in memblock and a function memblock_mark_hotplug() to mark hotpluggable memory if we find one. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com> Reviewed-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: "Rafael J . Wysocki" <rjw@sisk.pl> Cc: Chen Tang <imtangchen@gmail.com> Cc: Gong Chen <gong.chen@linux.intel.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Larry Woodman <lwoodman@redhat.com> Cc: Len Brown <lenb@kernel.org> Cc: Liu Jiang <jiang.liu@huawei.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Nazarewicz <mina86@mina86.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Rik van Riel <riel@redhat.com> Cc: Taku Izumi <izumi.taku@jp.fujitsu.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Renninger <trenn@suse.de> Cc: Toshi Kani <toshi.kani@hp.com> Cc: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com> Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com> Cc: Wen Congyang <wency@cn.fujitsu.com> Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Cc: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2014-01-21 16:19:44 -08:00

1 2 3 4 5 ...

63965 Commits