linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-22 09:48:07 +07:00

Author	SHA1	Message	Date
Al Viro	43b1682024	make sure that /linuxrc has std{in,out,err} Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2013-01-19 13:29:54 -05:00
Tejun Heo	bb813f4c93	init, block: try to load default elevator module early during boot This patch adds default module loading and uses it to load the default block elevator. During boot, it's called right after initramfs or initrd is made available and right before control is passed to userland. This ensures that as long as the modules are available in the usual places in initramfs, initrd or the root filesystem, the default modules are loaded as soon as possible. This will replace the on-demand elevator module loading from elevator init path. v2: Fixed build breakage when !CONFIG_BLOCK. Reported by kbuild test robot. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Alex Riesen <raa.lkml@gmail.com> Cc: Fengguang We <fengguang.wu@intel.com>	2013-01-18 14:05:56 -08:00
Greg Kroah-Hartman	ed408f7c0f	Merge 3.9-rc4 into driver-core-next This is to fix up a build problem with a wireless driver due to the dynamic-debug patches in this branch. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2013-01-17 19:48:18 -08:00
Kirill Smelkov	3a55fb0d9f	Tell the world we gave up on pushing CC_OPTIMIZE_FOR_SIZE In commit `281dc5c5ec` ("Give up on pushing CC_OPTIMIZE_FOR_SIZE") we already changed the actual default value, but the help-text still suggested 'y'. Fix the help text too, for all the same reasons. Sadly, -Os keeps on generating some very suboptimal code for certain cases, to the point where any I$ miss upside is swamped by the downside. The main ones are: - using "rep movsb" for memcpy, even on CPU's where that is horrendously bad for performance. - not honoring branch prediction information, so any I$ footprint you win from smaller code, you lose from less code density in the I$. - using divide instructions when that is very expensive. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-01-16 12:42:57 -08:00
Kees Cook	19c9239981	init: remove depends on CONFIG_EXPERIMENTAL The CONFIG_EXPERIMENTAL config item has not carried much meaning for a while now and is almost always enabled by default. As agreed during the Linux kernel summit, remove it from any "depends on" lines in Kconfigs. CC: "Eric W. Biederman" <ebiederm@xmission.com> CC: Serge Hallyn <serge.hallyn@canonical.com> CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>	2013-01-11 11:39:33 -08:00
Kees Cook	5a958db311	make CONFIG_EXPERIMENTAL invisible and default This config item has not carried much meaning for a while now and is almost always enabled by default (especially in distro builds). As agreed during the Linux kernel summit, it should be removed. As a first step, remove it from being listed, and default it to on. Once it has been removed from all subsystem Kconfigs, it will be dropped entirely. For items that really are experimental, maintainers should use "default n", optionally include "(EXPERIMENTAL)" in the title, and add language to the help text indicating why the item should be considered experimental. For items that are dangerously experimental, the maintainer is encouraged to follow the above title recommendation, add stronger language to the help text, and optionally use (depending on the extent of the danger, from least to most dangerous): printk(), add_taint(TAINT_WARN), add_taint(TAINT_CRAP), WARN_ON(1), and CONFIG_BROKEN. CC: Greg KH <gregkh@linuxfoundation.org> CC: "Eric W. Biederman" <ebiederm@xmission.com> CC: Serge Hallyn <serge.hallyn@canonical.com> CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> CC: Andrew Morton <akpm@linux-foundation.org> CC: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2013-01-11 11:38:03 -08:00
Vineet Gupta	b6fca72536	sysctl: Enable IA64 "ignore-unaligned-usertrap" to be used cross-arch IA64 defines /proc/sys/kernel/ignore-unaligned-usertrap to control verbose warnings on unaligned access emulation. Although the exact mechanics of what to do with sysctl (ignore/shout) are arch specific, this change enables the sysctl to be usable cross-arch. Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Tony Luck <tony.luck@intel.com>	2013-01-09 10:48:34 -08:00
Vineet Gupta	f80b0c904d	Ensure that kernel_init_freeable() is not inlined into non __init code Commit `d6b2123802` "make sure that we always have a return path from kernel_execve()" reshuffled kernel_init()/init_post() to ensure that kernel_execve() has a caller to return to. It removed __init annotation for kernel_init() and introduced/calls a new routine kernel_init_freeable(). Latter however is inlined by any reasonable compiler (ARC gcc 4.4 in this case), causing slight code bloat. This patch forces kernel_init_freeable() as noinline reducing the .text bloat-o-meter vmlinux vmlinux_new add/remove: 1/0 grow/shrink: 0/1 up/down: 374/-334 (40) function old new delta kernel_init_freeable - 374 +374 (.init.text) kernel_init 628 294 -334 (.text) Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-12-26 01:14:12 -05:00
Linus Torvalds	54d46ea993	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull signal handling cleanups from Al Viro: "sigaltstack infrastructure + conversion for x86, alpha and um, COMPAT_SYSCALL_DEFINE infrastructure. Note that there are several conflicts between "unify SS_ONSTACK/SS_DISABLE definitions" and UAPI patches in mainline; resolution is trivial - just remove definitions of SS_ONSTACK and SS_DISABLED from arch//uapi/asm/signal.h; they are all identical and include/uapi/linux/signal.h contains the unified variant." Fixed up conflicts as per Al. 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: alpha: switch to generic sigaltstack new helpers: __save_altstack/__compat_save_altstack, switch x86 and um to those generic compat_sys_sigaltstack() introduce generic sys_sigaltstack(), switch x86 and um to it new helper: compat_user_stack_pointer() new helper: restore_altstack() unify SS_ONSTACK/SS_DISABLE definitions new helper: current_user_stack_pointer() missing user_stack_pointer() instances Bury the conditionals from kernel_thread/kernel_execve series COMPAT_SYSCALL_DEFINE: infrastructure	2012-12-20 18:05:28 -08:00
Al Viro	ae903caae2	Bury the conditionals from kernel_thread/kernel_execve series All architectures have CONFIG_GENERIC_KERNEL_THREAD CONFIG_GENERIC_KERNEL_EXECVE __ARCH_WANT_SYS_EXECVE None of them have __ARCH_WANT_KERNEL_EXECVE and there are only two callers of kernel_execve() (which is a trivial wrapper for do_execve() now) left. Kill the conditionals and make both callers use do_execve(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-12-19 18:07:38 -05:00
Glauber Costa	d7f25f8a2f	memcg: infrastructure to match an allocation to the right cache The page allocator is able to bind a page to a memcg when it is allocated. But for the caches, we'd like to have as many objects as possible in a page belonging to the same cache. This is done in this patch by calling memcg_kmem_get_cache in the beginning of every allocation function. This function is patched out by static branches when kernel memory controller is not being used. It assumes that the task allocating, which determines the memcg in the page allocator, belongs to the same cgroup throughout the whole process. Misaccounting can happen if the task calls memcg_kmem_get_cache() while belonging to a cgroup, and later on changes. This is considered acceptable, and should only happen upon task migration. Before the cache is created by the memcg core, there is also a possible imbalance: the task belongs to a memcg, but the cache being allocated from is the global cache, since the child cache is not yet guaranteed to be ready. This case is also fine, since in this case the GFP_KMEMCG will not be passed and the page allocator will not attempt any cgroup accounting. Signed-off-by: Glauber Costa <glommer@parallels.com> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: JoonSoo Kim <js1304@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Michal Hocko <mhocko@suse.cz> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Rik van Riel <riel@redhat.com> Cc: Suleiman Souhlal <suleiman@google.com> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-18 15:02:14 -08:00
Glauber Costa	510fc4e11b	memcg: kmem accounting basic infrastructure Add the basic infrastructure for the accounting of kernel memory. To control that, the following files are created: * memory.kmem.usage_in_bytes * memory.kmem.limit_in_bytes * memory.kmem.failcnt * memory.kmem.max_usage_in_bytes They have the same meaning of their user memory counterparts. They reflect the state of the "kmem" res_counter. Per cgroup kmem memory accounting is not enabled until a limit is set for the group. Once the limit is set the accounting cannot be disabled for that group. This means that after the patch is applied, no behavioral changes exists for whoever is still using memcg to control their memory usage, until memory.kmem.limit_in_bytes is set for the first time. We always account to both user and kernel resource_counters. This effectively means that an independent kernel limit is in place when the limit is set to a lower value than the user memory. A equal or higher value means that the user limit will always hit first, meaning that kmem is effectively unlimited. People who want to track kernel memory but not limit it, can set this limit to a very high number (like RESOURCE_MAX - 1page - that no one will ever hit, or equal to the user memory) [akpm@linux-foundation.org: MEMCG_MMEM only works with slab and slub] Signed-off-by: Glauber Costa <glommer@parallels.com> Acked-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Cc: Christoph Lameter <cl@linux.com> Cc: David Rientjes <rientjes@google.com> Cc: Frederic Weisbecker <fweisbec@redhat.com> Cc: Greg Thelen <gthelen@google.com> Cc: JoonSoo Kim <js1304@gmail.com> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Pekka Enberg <penberg@cs.helsinki.fi> Cc: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-18 15:02:12 -08:00
Linus Torvalds	6a2b60b17b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace changes from Eric Biederman: "While small this set of changes is very significant with respect to containers in general and user namespaces in particular. The user space interface is now complete. This set of changes adds support for unprivileged users to create user namespaces and as a user namespace root to create other namespaces. The tyranny of supporting suid root preventing unprivileged users from using cool new kernel features is broken. This set of changes completes the work on setns, adding support for the pid, user, mount namespaces. This set of changes includes a bunch of basic pid namespace cleanups/simplifications. Of particular significance is the rework of the pid namespace cleanup so it no longer requires sending out tendrils into all kinds of unexpected cleanup paths for operation. At least one case of broken error handling is fixed by this cleanup. The files under /proc/<pid>/ns/ have been converted from regular files to magic symlinks which prevents incorrect caching by the VFS, ensuring the files always refer to the namespace the process is currently using and ensuring that the ptrace_mayaccess permission checks are always applied. The files under /proc/<pid>/ns/ have been given stable inode numbers so it is now possible to see if different processes share the same namespaces. Through the David Miller's net tree are changes to relax many of the permission checks in the networking stack to allowing the user namespace root to usefully use the networking stack. Similar changes for the mount namespace and the pid namespace are coming through my tree. Two small changes to add user namespace support were commited here adn in David Miller's -net tree so that I could complete the work on the /proc/<pid>/ns/ files in this tree. Work remains to make it safe to build user namespaces and 9p, afs, ceph, cifs, coda, gfs2, ncpfs, nfs, nfsd, ocfs2, and xfs so the Kconfig guard remains in place preventing that user namespaces from being built when any of those filesystems are enabled. Future design work remains to allow root users outside of the initial user namespace to mount more than just /proc and /sys." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (38 commits) proc: Usable inode numbers for the namespace file descriptors. proc: Fix the namespace inode permission checks. proc: Generalize proc inode allocation userns: Allow unprivilged mounts of proc and sysfs userns: For /proc/self/{uid,gid}_map derive the lower userns from the struct file procfs: Print task uids and gids in the userns that opened the proc file userns: Implement unshare of the user namespace userns: Implent proc namespace operations userns: Kill task_user_ns userns: Make create_new_namespaces take a user_ns parameter userns: Allow unprivileged use of setns. userns: Allow unprivileged users to create new namespaces userns: Allow setting a userns mapping to your current uid. userns: Allow chown and setgid preservation userns: Allow unprivileged users to create user namespaces. userns: Ignore suid and sgid on binaries if the uid or gid can not be mapped userns: fix return value on mntns_install() failure vfs: Allow unprivileged manipulation of the mount namespace. vfs: Only support slave subtrees across different user namespaces vfs: Add a user namespace reference from struct mnt_namespace ...	2012-12-17 15:44:47 -08:00
Linus Torvalds	9228ff9038	Merge branch 'for-3.8/drivers' of git://git.kernel.dk/linux-block Pull block driver update from Jens Axboe: "Now that the core bits are in, here are the driver bits for 3.8. The branch contains: - A huge pile of drbd bits that were dumped from the 3.7 merge window. Following that, it was both made perfectly clear that there is going to be no more over-the-wall pulls and how the situation on individual pulls can be improved. - A few cleanups from Akinobu Mita for drbd and cciss. - Queue improvement for loop from Lukas. This grew into adding a generic interface for waiting/checking an even with a specific lock, allowing this to be pulled out of md and now loop and drbd is also using it. - A few fixes for xen back/front block driver from Roger Pau Monne. - Partition improvements from Stephen Warren, allowing partiion UUID to be used as an identifier." * 'for-3.8/drivers' of git://git.kernel.dk/linux-block: (609 commits) drbd: update Kconfig to match current dependencies drbd: Fix drbdsetup wait-connect, wait-sync etc... commands drbd: close race between drbd_set_role and drbd_connect drbd: respect no-md-barriers setting also when changed online via disk-options drbd: Remove obsolete check drbd: fixup after wait_even_lock_irq() addition to generic code loop: Limit the number of requests in the bio list wait: add wait_event_lock_irq() interface xen-blkfront: free allocated page xen-blkback: move free persistent grants code block: partition: msdos: provide UUIDs for partitions init: reduce PARTUUID min length to 1 from 36 block: store partition_meta_info.uuid as a string cciss: use check_signature() cciss: cleanup bitops usage drbd: use copy_highpage drbd: if the replication link breaks during handshake, keep retrying drbd: check return of kmalloc in receive_uuids drbd: Broadcast sync progress no more often than once per second drbd: don't try to clear bits once the disk has failed ...	2012-12-17 13:39:11 -08:00
Linus Torvalds	3d59eebc5e	Automatic NUMA Balancing V11 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQIcBAABAgAGBQJQx0kQAAoJEHzG/DNEskfi4fQP/R5PRovayroZALBMLnVJDaLD Ttr9p40VNXbiJ+MfRgatJjSSJZ4Jl+fC3NEqBhcwVZhckZZb9R2s0WtrSQo5+ZbB vdRfiuKoCaKM4cSZ08C12uTvsF6xjhjd27CTUlMkyOcDoKxMEFKelv0hocSxe4Wo xqlv3eF+VsY7kE1BNbgBP06SX4tDpIHRxXfqJPMHaSKQmre+cU0xG2GcEu3QGbHT DEDTI788YSaWLmBfMC+kWoaQl1+bV/FYvavIAS8/o4K9IKvgR42VzrXmaFaqrbgb 72ksa6xfAi57yTmZHqyGmts06qYeBbPpKI+yIhCMInxA9CY3lPbvHppRf0RQOyzj YOi4hovGEMJKE+BCILukhJcZ9jCTtS3zut6v1rdvR88f4y7uhR9RfmRfsxuW7PNj 3Rmh191+n0lVWDmhOs2psXuCLJr3LEiA0dFffN1z8REUTtTAZMsj8Rz+SvBNAZDR hsJhERVeXB6X5uQ5rkLDzbn1Zic60LjVw7LIp6SF2OYf/YKaF8vhyWOA8dyCEu8W CGo7AoG0BO8tIIr8+LvFe8CweypysZImx4AjCfIs4u9pu/v11zmBvO9NO5yfuObF BreEERYgTes/UITxn1qdIW4/q+Nr0iKO3CTqsmu6L1GfCz3/XzPGs3U26fUhllqi Ka0JKgnWvsa6ez6FSzKI =ivQa -----END PGP SIGNATURE----- Merge tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma Pull Automatic NUMA Balancing bare-bones from Mel Gorman: "There are three implementations for NUMA balancing, this tree (balancenuma), numacore which has been developed in tip/master and autonuma which is in aa.git. In almost all respects balancenuma is the dumbest of the three because its main impact is on the VM side with no attempt to be smart about scheduling. In the interest of getting the ball rolling, it would be desirable to see this much merged for 3.8 with the view to building scheduler smarts on top and adapting the VM where required for 3.9. The most recent set of comparisons available from different people are mel: https://lkml.org/lkml/2012/12/9/108 mingo: https://lkml.org/lkml/2012/12/7/331 tglx: https://lkml.org/lkml/2012/12/10/437 srikar: https://lkml.org/lkml/2012/12/10/397 The results are a mixed bag. In my own tests, balancenuma does reasonably well. It's dumb as rocks and does not regress against mainline. On the other hand, Ingo's tests shows that balancenuma is incapable of converging for this workloads driven by perf which is bad but is potentially explained by the lack of scheduler smarts. Thomas' results show balancenuma improves on mainline but falls far short of numacore or autonuma. Srikar's results indicate we all suffer on a large machine with imbalanced node sizes. My own testing showed that recent numacore results have improved dramatically, particularly in the last week but not universally. We've butted heads heavily on system CPU usage and high levels of migration even when it shows that overall performance is better. There are also cases where it regresses. Of interest is that for specjbb in some configurations it will regress for lower numbers of warehouses and show gains for higher numbers which is not reported by the tool by default and sometimes missed in treports. Recently I reported for numacore that the JVM was crashing with NullPointerExceptions but currently it's unclear what the source of this problem is. Initially I thought it was in how numacore batch handles PTEs but I'm no longer think this is the case. It's possible numacore is just able to trigger it due to higher rates of migration. These reports were quite late in the cycle so I/we would like to start with this tree as it contains much of the code we can agree on and has not changed significantly over the last 2-3 weeks." * tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma: (50 commits) mm/rmap, migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable mm/rmap: Convert the struct anon_vma::mutex to an rwsem mm: migrate: Account a transhuge page properly when rate limiting mm: numa: Account for failed allocations and isolations as migration failures mm: numa: Add THP migration for the NUMA working set scanning fault case build fix mm: numa: Add THP migration for the NUMA working set scanning fault case. mm: sched: numa: Delay PTE scanning until a task is scheduled on a new node mm: sched: numa: Control enabling and disabling of NUMA balancing if !SCHED_DEBUG mm: sched: numa: Control enabling and disabling of NUMA balancing mm: sched: Adapt the scanning rate if a NUMA hinting fault does not migrate mm: numa: Use a two-stage filter to restrict pages being migrated for unlikely task<->node relationships mm: numa: migrate: Set last_nid on newly allocated page mm: numa: split_huge_page: Transfer last_nid on tail page mm: numa: Introduce last_nid to the page frame sched: numa: Slowly increase the scanning period as NUMA faults are handled mm: numa: Rate limit setting of pte_numa if node is saturated mm: numa: Rate limit the amount of memory that is migrated between nodes mm: numa: Structures for Migrate On Fault per NUMA migration rate limiting mm: numa: Migrate pages handled during a pmd_numa hinting fault mm: numa: Migrate on reference policy ...	2012-12-16 15:18:08 -08:00
Linus Torvalds	11520e5e7c	Revert "x86-64/efi: Use EFI to deal with platform wall clock (again)" This reverts commit `bd52276fa1` ("x86-64/efi: Use EFI to deal with platform wall clock (again)"), and the two supporting commits: `da5a108d05`: "x86/kernel: remove tboot 1:1 page table creation code" `185034e72d`: "x86, efi: 1:1 pagetable mapping for virtual EFI calls") as they all depend semantically on commit `53b87cf088` ("x86, mm: Include the entire kernel memory map in trampoline_pgd") that got reverted earlier due to the problems it caused. This was pointed out by Yinghai Lu, and verified by me on my Macbook Air that uses EFI. Pointed-out-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-15 15:20:41 -08:00
Linus Torvalds	d42b3a2906	Merge branch 'core-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 EFI update from Peter Anvin: "EFI tree, from Matt Fleming. Most of the patches are the new efivarfs filesystem by Matt Garrett & co. The balance are support for EFI wallclock in the absence of a hardware-specific driver, and various fixes and cleanups." * 'core-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) efivarfs: Make efivarfs_fill_super() static x86, efi: Check table header length in efi_bgrt_init() efivarfs: Use query_variable_info() to limit kmalloc() efivarfs: Fix return value of efivarfs_file_write() efivarfs: Return a consistent error when efivarfs_get_inode() fails efivarfs: Make 'datasize' unsigned long efivarfs: Add unique magic number efivarfs: Replace magic number with sizeof(attributes) efivarfs: Return an error if we fail to read a variable efi: Clarify GUID length calculations efivarfs: Implement exclusive access for {get,set}_variable efivarfs: efivarfs_fill_super() ensure we clean up correctly on error efivarfs: efivarfs_fill_super() ensure we free our temporary name efivarfs: efivarfs_fill_super() fix inode reference counts efivarfs: efivarfs_create() ensure we drop our reference on inode on error efivarfs: efivarfs_file_read ensure we free data in error paths x86-64/efi: Use EFI to deal with platform wall clock (again) x86/kernel: remove tboot 1:1 page table creation code x86, efi: 1:1 pagetable mapping for virtual EFI calls x86, mm: Include the entire kernel memory map in trampoline_pgd ...	2012-12-14 10:08:40 -08:00
Lai Jiangshan	3c466d46a9	init: use N_MEMORY instead N_HIGH_MEMORY N_HIGH_MEMORY stands for the nodes that has normal or high memory. N_MEMORY stands for the nodes that has any memory. The code here need to handle with the nodes which have memory, we should use N_MEMORY instead. Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com> Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Cc: Christoph Lameter <cl@linux.com> Cc: Hillf Danton <dhillf@gmail.com> Cc: Lin Feng <linfeng@cn.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-12-12 17:38:33 -08:00
Mel Gorman	1a687c2e9a	mm: sched: numa: Control enabling and disabling of NUMA balancing This patch adds Kconfig options and kernel parameters to allow the enabling and disabling of automatic NUMA balancing. The existance of such a switch was and is very important when debugging problems related to transparent hugepages and we should have the same for automatic NUMA placement. Signed-off-by: Mel Gorman <mgorman@suse.de>	2012-12-11 14:42:55 +00:00
Andrea Arcangeli	be3a728427	mm: numa: pte_numa() and pmd_numa() Implement pte_numa and pmd_numa. We must atomically set the numa bit and clear the present bit to define a pte_numa or pmd_numa. Once a pte or pmd has been set as pte_numa or pmd_numa, the next time a thread touches a virtual address in the corresponding virtual range, a NUMA hinting page fault will trigger. The NUMA hinting page fault will clear the NUMA bit and set the present bit again to resolve the page fault. The expectation is that a NUMA hinting page fault is used as part of a placement policy that decides if a page should remain on the current node or migrated to a different node. Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Signed-off-by: Mel Gorman <mgorman@suse.de>	2012-12-11 14:42:36 +00:00
Ingo Molnar	630e1e0bcd	Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Conflicts: arch/x86/kernel/ptrace.c Pull the latest RCU tree from Paul E. McKenney: " The major features of this series are: 1. A first version of no-callbacks CPUs. This version prohibits offlining CPU 0, but only when enabled via CONFIG_RCU_NOCB_CPU=y. Relaxing this constraint is in progress, but not yet ready for prime time. These commits were posted to LKML at https://lkml.org/lkml/2012/10/30/724, and are at branch rcu/nocb. 2. Changes to SRCU that allows statically initialized srcu_struct structures. These commits were posted to LKML at https://lkml.org/lkml/2012/10/30/296, and are at branch rcu/srcu. 3. Restructuring of RCU's debugfs output. These commits were posted to LKML at https://lkml.org/lkml/2012/10/30/341, and are at branch rcu/tracing. 4. Additional CPU-hotplug/RCU improvements, posted to LKML at https://lkml.org/lkml/2012/10/30/327, and are at branch rcu/hotplug. Note that the commit eliminating __stop_machine() was judged to be too-high of risk, so is deferred to 3.9. 5. Changes to RCU's idle interface, most notably a new module parameter that redirects normal grace-period operations to their expedited equivalents. These were posted to LKML at https://lkml.org/lkml/2012/10/30/739, and are at branch rcu/idle. 6. Additional diagnostics for RCU's CPU stall warning facility, posted to LKML at https://lkml.org/lkml/2012/10/30/315, and are at branch rcu/stall. The most notable change reduces the default RCU CPU stall-warning time from 60 seconds to 21 seconds, so that it once again happens sooner than the softlockup timeout. 7. Documentation updates, which were posted to LKML at https://lkml.org/lkml/2012/10/30/280, and are at branch rcu/doc. A couple of late-breaking changes were posted at https://lkml.org/lkml/2012/11/16/634 and https://lkml.org/lkml/2012/11/16/547. 8. Miscellaneous fixes, which were posted to LKML at https://lkml.org/lkml/2012/10/30/309, along with a late-breaking change posted at Fri, 16 Nov 2012 11:26:25 -0800 with message-ID <20121116192625.GA447@linux.vnet.ibm.com>, but which lkml.org seems to have missed. These are at branch rcu/fixes. 9. Finally, a fix for an lockdep-RCU splat was posted to LKML at https://lkml.org/lkml/2012/11/7/486. This is at rcu/next. " Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-12-03 06:27:05 +01:00
Frederic Weisbecker	91d1aa43d3	context_tracking: New context tracking susbsystem Create a new subsystem that probes on kernel boundaries to keep track of the transitions between level contexts with two basic initial contexts: user or kernel. This is an abstraction of some RCU code that use such tracking to implement its userspace extended quiescent state. We need to pull this up from RCU into this new level of indirection because this tracking is also going to be used to implement an "on demand" generic virtual cputime accounting. A necessary step to shutdown the tick while still accounting the cputime. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Li Zhong <zhong@linux.vnet.ibm.com> Cc: Gilad Ben-Yossef <gilad@benyossef.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org> [ paulmck: fix whitespace error and email address. ] Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-11-30 11:40:07 -08:00
Stephen Warren	d33b98fc82	block: partition: msdos: provide UUIDs for partitions The MSDOS/MBR partition table includes a 32-bit unique ID, often referred to as the NT disk signature. When combined with a partition number within the table, this can form a unique ID similar in concept to EFI/GPT's partition UUID. Constructing and recording this value in struct partition_meta_info allows MSDOS partitions to be referred to on the kernel command-line using the following syntax: root=PARTUUID=0002dd75-01 Signed-off-by: Stephen Warren <swarren@nvidia.com> Cc: Tejun Heo <tj@kernel.org> Cc: Will Drewry <wad@chromium.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-11-23 14:28:58 +01:00
Stephen Warren	283f8fc039	init: reduce PARTUUID min length to 1 from 36 Reduce the minimum length for a root=PARTUUID= parameter to be considered valid from 36 to 1. EFI/GPT partition UUIDs are always exactly 36 characters long, hence the previous limit. However, the next patch will support DOS/MBR UUIDs too, which have a different, shorter, format. Instead of validating any particular length, just ensure that at least some non-empty value was given by the user. Also, consider a missing UUID value to be a parsing error, in the same vein as if /PARTNROFF exists and can't be parsed. As such, make both error cases print a message and disable rootwait. Convert to pr_err while we're at it. Signed-off-by: Stephen Warren <swarren@nvidia.com> Cc: Tejun Heo <tj@kernel.org> Cc: Will Drewry <wad@chromium.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-11-23 14:28:56 +01:00
Stephen Warren	1ad7e89940	block: store partition_meta_info.uuid as a string This will allow other types of UUID to be stored here, aside from true UUIDs. This also simplifies code that uses this field, since it's usually constructed from a, used as a, or compared to other, strings. Note: A simplistic approach here would be to set uuid_str[36]=0 whenever a /PARTNROFF option was found to be present. However, this modifies the input string, and causes subsequent calls to devt_from_partuuid() not to see the /PARTNROFF option, which causes different results. In order to avoid misleading future maintainers, this parameter is marked const. Signed-off-by: Stephen Warren <swarren@nvidia.com> Cc: Tejun Heo <tj@kernel.org> Cc: Will Drewry <wad@chromium.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-11-23 14:28:53 +01:00
Eric W. Biederman	98f842e675	proc: Usable inode numbers for the namespace file descriptors. Assign a unique proc inode to each namespace, and use that inode number to ensure we only allocate at most one proc inode for every namespace in proc. A single proc inode per namespace allows userspace to test to see if two processes are in the same namespace. This has been a long requested feature and only blocked because a naive implementation would put the id in a global space and would ultimately require having a namespace for the names of namespaces, making migration and certain virtualization tricks impossible. We still don't have per superblock inode numbers for proc, which appears necessary for application unaware checkpoint/restart and migrations (if the application is using namespace file descriptors) but that is now allowd by the design if it becomes important. I have preallocated the ipc and uts initial proc inode numbers so their structures can be statically initialized. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-11-20 04:19:49 -08:00
Eric W. Biederman	1c4042c29b	pidns: Consolidate initialzation of special init task state Instead of setting child_reaper and SIGNAL_UNKILLABLE one way for the system init process, and another way for pid namespace init processes test pid->nr == 1 and use the same code for both. For the global init this results in SIGNAL_UNKILLABLE being set much earlier in the initialization process. This is a small cleanup and it paves the way for allowing unshare and enter of the pid namespace as that path like our global init also will not set CLONE_NEWPID. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-11-19 05:59:15 -08:00
Frederic Weisbecker	74876a98a8	printk: Wake up klogd using irq_work klogd is woken up asynchronously from the tick in order to do it safely. However if printk is called when the tick is stopped, the reader won't be woken up until the next interrupt, which might not fire for a while. As a result, the user may miss some message. To fix this, lets implement the printk tick using a lazy irq work. This subsystem takes care of the timer tick state and can fix up accordingly. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-11-18 01:01:49 +01:00
Frederic Weisbecker	6147a9d807	irq_work: Remove CONFIG_HAVE_IRQ_WORK irq work can run on any arch even without IPI support because of the hook on update_process_times(). So lets remove HAVE_IRQ_WORK because it doesn't reflect any backend requirement. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Paul Gortmaker <paul.gortmaker@windriver.com>	2012-11-17 19:25:12 +01:00
Paul E. McKenney	3fbfbf7a3b	rcu: Add callback-free CPUs RCU callback execution can add significant OS jitter and also can degrade both scheduling latency and, in asymmetric multiprocessors, energy efficiency. This commit therefore adds the ability for selected CPUs ("rcu_nocbs=" boot parameter) to have their callbacks offloaded to kthreads. If the "rcu_nocb_poll" boot parameter is also specified, these kthreads will do polling, removing the need for the offloaded CPUs to do wakeups. At least one CPU must be doing normal callback processing: currently CPU 0 cannot be selected as a no-CBs CPU. In addition, attempts to offline the last normal-CBs CPU will fail. This feature was inspired by Jim Houston's and Joe Korty's JRCU, and this commit includes fixes to problems located by Fengguang Wu's kbuild test robot. [ paulmck: Added gfp.h include file as suggested by Fengguang Wu. ] Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-11-16 10:05:56 -08:00
Eric W. Biederman	499dcf2024	userns: Support fuse interacting with multiple user namespaces Use kuid_t and kgid_t in struct fuse_conn and struct fuse_mount_data. The connection between between a fuse filesystem and a fuse daemon is established when a fuse filesystem is mounted and provided with a file descriptor the fuse daemon created by opening /dev/fuse. For now restrict the communication of uids and gids between the fuse filesystem and the fuse daemon to the initial user namespace. Enforce this by verifying the file descriptor passed to the mount of fuse was opened in the initial user namespace. Ensuring the mount happens in the initial user namespace is not necessary as mounts from non-initial user namespaces are not yet allowed. In fuse_req_init_context convert the currrent fsuid and fsgid into the initial user namespace for the request that will be sent to the fuse daemon. In fuse_fill_attr convert the uid and gid passed from the fuse daemon from the initial user namespace into kuids and kgids. In iattr_to_fattr called from fuse_setattr convert kuids and kgids into the uids and gids in the initial user namespace before passing them to the fuse filesystem. In fuse_change_attributes_common called from fuse_dentry_revalidate, fuse_permission, fuse_geattr, and fuse_setattr, and fuse_iget convert the uid and gid from the fuse daemon into a kuid and a kgid to store on the fuse inode. By default fuse mounts are restricted to task whose uid, suid, and euid matches the fuse user_id and whose gid, sgid, and egid matches the fuse group id. Convert the user_id and group_id mount options into kuids and kgids at mount time, and use uid_eq and gid_eq to compare the in fuse_allow_task. Cc: Miklos Szeredi <miklos@szeredi.hu> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-11-14 22:05:33 -08:00
Eric W. Biederman	45634cd8cb	userns: Support autofs4 interacing with multiple user namespaces Use kuid_t and kgid_t in struct autofs_info and struct autofs_wait_queue. When creating directories and symlinks default the uid and gid of the mount requester to the global root uid and gid. autofs4_wait will update these fields when a mount is requested. When generating autofsv5 packets report the uid and gid of the mount requestor in user namespace of the process that opened the pipe, reporting unmapped uids and gids as overflowuid and overflowgid. In autofs_dev_ioctl_requester return the uid and gid of the last mount requester converted into the calling processes user namespace. When the uid or gid don't map return overflowuid and overflowgid as appropriate, allowing failure to find a mount requester to be distinguished from failure to map a mount requester. The uid and gid mount options specifying the user and group of the root autofs inode are converted into kuid and kgid as they are parsed defaulting to the current uid and current gid of the process that mounts autofs. Mounting of autofs for the present remains confined to processes in the initial user namespace. Cc: Ian Kent <raven@themaw.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-11-14 22:05:32 -08:00
David Howells	eded09ccf5	FRV: gcc-4.1.2 also inlines weak functions gcc-4.1.2 inlines weak functions, which causes FRV to fail when the dummy thread_info_cache_init() gets inlined into start_kernel(). Signed-off-by: David Howells <dhowells@redhat.com>	2012-11-02 13:20:42 +00:00
Jan Beulich	bd52276fa1	x86-64/efi: Use EFI to deal with platform wall clock (again) Other than ix86, x86-64 on EFI so far didn't set the {g,s}et_wallclock accessors to the EFI routines, thus incorrectly using raw RTC accesses instead. Simply removing the #ifdef around the respective code isn't enough, however: While so far early get-time calls were done in physical mode, this doesn't work properly for x86-64, as virtual addresses would still need to be set up for all runtime regions (which wasn't the case on the system I have access to), so instead the patch moves the call to efi_enter_virtual_mode() ahead (which in turn allows to drop all code related to calling efi-get-time in physical mode). Additionally the earlier calling of efi_set_executable() requires the CPA code to cope, i.e. during early boot it must be avoided to call cpa_flush_array(), as the first thing this function does is a BUG_ON(irqs_disabled()). Also make the two EFI functions in question here static - they're not being referenced elsewhere. History: This commit was originally merged as `bacef661ac` ("x86-64/efi: Use EFI to deal with platform wall clock") but it resulted in some ASUS machines no longer booting due to a firmware bug, and so was reverted in `f026cfa82f`. A pre-emptive fix for the buggy ASUS firmware was merged in 03a1c254975e ("x86, efi: 1:1 pagetable mapping for virtual EFI calls") so now this patch can be reapplied. Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Matt Fleming <matt.fleming@intel.com> Acked-by: Matthew Garrett <mjg@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Matt Fleming <matt.fleming@intel.com> [added commit history]	2012-10-30 10:39:20 +00:00
Paul Gortmaker	af71befa28	rcu: Wordsmith help text for RCU_USER_QS kernel parameter This commit adds a "try" missing from the end of the first paragraph of the RCU_USER_QS help text. [ paulmck: Also fix up the last paragraph a bit. ] Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-10-24 11:07:09 -07:00
Paul E. McKenney	ba49df4767	rcu: Update RCU_FAST_NO_HZ help text The RCU_FAST_NO_HZ help text included a warning about overhead on large systems, but that issue has since been resolved. The main remaining issue with RCU_FAST_NO_HZ is increased real-time latency. This commit therefore updates the help text accordingly. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-10-23 14:54:07 -07:00
Linus Torvalds	d25282d1c9	Merge branch 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux Pull module signing support from Rusty Russell: "module signing is the highlight, but it's an all-over David Howells frenzy..." Hmm "Magrathea: Glacier signing key". Somebody has been reading too much HHGTTG. * 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (37 commits) X.509: Fix indefinite length element skip error handling X.509: Convert some printk calls to pr_devel asymmetric keys: fix printk format warning MODSIGN: Fix 32-bit overflow in X.509 certificate validity date checking MODSIGN: Make mrproper should remove generated files. MODSIGN: Use utf8 strings in signer's name in autogenerated X.509 certs MODSIGN: Use the same digest for the autogen key sig as for the module sig MODSIGN: Sign modules during the build process MODSIGN: Provide a script for generating a key ID from an X.509 cert MODSIGN: Implement module signature checking MODSIGN: Provide module signing public keys to the kernel MODSIGN: Automatically generate module signing keys if missing MODSIGN: Provide Kconfig options MODSIGN: Provide gitignore and make clean rules for extra files MODSIGN: Add FIPS policy module: signature checking hook X.509: Add a crypto key parser for binary (DER) X.509 certificates MPILIB: Provide a function to read raw data into an MPI X.509: Add an ASN.1 decoder X.509: Add simple ASN.1 grammar compiler ...	2012-10-14 13:39:34 -07:00
Linus Torvalds	4e21fc138b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal Pull third pile of kernel_execve() patches from Al Viro: "The last bits of infrastructure for kernel_thread() et.al., with alpha/arm/x86 use of those. Plus sanitizing the asm glue and do_notify_resume() on alpha, fixing the "disabled irq while running task_work stuff" breakage there. At that point the rest of kernel_thread/kernel_execve/sys_execve work can be done independently for different architectures. The only pending bits that do depend on having all architectures converted are restrictred to fs/* and kernel/* - that'll obviously have to wait for the next cycle. I thought we'd have to wait for all of them done before we start eliminating the longjump-style insanity in kernel_execve(), but it turned out there's a very simple way to do that without flagday-style changes." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal: alpha: switch to saner kernel_execve() semantics arm: switch to saner kernel_execve() semantics x86, um: convert to saner kernel_execve() semantics infrastructure for saner ret_from_kernel_thread semantics make sure that kernel_thread() callbacks call do_exit() themselves make sure that we always have a return path from kernel_execve() ppc: eeh_event should just use kthread_run() don't bother with kernel_thread/kernel_execve for launching linuxrc alpha: get rid of switch_stack argument of do_work_pending() alpha: don't bother passing switch_stack separately from regs alpha: take SIGPENDING/NOTIFY_RESUME loop into signal.c alpha: simplify TIF_NEED_RESCHED handling	2012-10-13 10:05:52 +09:00
Linus Torvalds	8418263e35	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull third pile of VFS updates from Al Viro: "Stuff from Jeff Layton, mostly. Sanitizing interplay between audit and namei, removing a lot of insanity from audit_inode() mess and getting things ready for his ESTALE patchset." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: procfs: don't need a PATH_MAX allocation to hold a string representation of an int vfs: embed struct filename inside of names_cache allocation if possible audit: make audit_inode take struct filename vfs: make path_openat take a struct filename pointer vfs: turn do_path_lookup into wrapper around struct filename variant audit: allow audit code to satisfy getname requests from its names_list vfs: define struct filename and have getname() return it vfs: unexport getname and putname symbols acct: constify the name arg to acct_on vfs: allocate page instead of names_cache buffer in mount_block_root audit: overhaul __audit_inode_child to accomodate retrying audit: optimize audit_compare_dname_path audit: make audit_compare_dname_path use parent_len helper audit: remove dirlen argument to audit_compare_dname_path audit: set the name_len in audit_inode for parent lookups audit: add a new "type" field to audit_names struct audit: reverse arguments to audit_inode_child audit: no need to walk list in audit_inode if name is NULL audit: pass in dentry to audit_copy_inode wherever possible audit: remove unnecessary NULL ptr checks from do_path_lookup	2012-10-13 10:04:42 +09:00
Al Viro	a74fb73c12	infrastructure for saner ret_from_kernel_thread semantics * allow kernel_execve() leave the actual return to userland to caller (selected by CONFIG_GENERIC_KERNEL_EXECVE). Callers updated accordingly. * architecture that does select GENERIC_KERNEL_EXECVE in its Kconfig should have its ret_from_kernel_thread() do this: call schedule_tail call the callback left for it by copy_thread(); if it ever returns, that's because it has just done successful kernel_execve() jump to return from syscall IOW, its only difference from ret_from_fork() is that it does call the callback. * such an architecture should also get rid of ret_from_kernel_execve() and __ARCH_WANT_KERNEL_EXECVE This is the last part of infrastructure patches in that area - from that point on work on different architectures can live independently. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-12 13:35:07 -04:00
Linus Torvalds	9d55ab71b7	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU fixes from Ingo Molnar: "This tree includes a shutdown/cpu-hotplug deadlock fix and a documentation fix." * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rcu: Advise most users not to enable RCU user mode rcu: Grace-period initialization excludes only RCU notifier	2012-10-12 22:12:07 +09:00
Jeff Layton	a608ca21f5	vfs: allocate page instead of names_cache buffer in mount_block_root First, it's incorrect to call putname() after __getname_gfp() since the bare __getname_gfp() call skips the auditing code, while putname() doesn't. mount_block_root allocates a PATH_MAX buffer via __getname_gfp, and then calls get_fs_names to fill the buffer. That function can call get_filesystem_list which assumes that that buffer is a full page in size. On arches where PAGE_SIZE != 4k, then this could potentially overrun. In practice, it's hard to imagine the list of filesystem names even approaching 4k, but it's best to be safe. Just allocate a page for this purpose instead. With this, we can also remove the __getname_gfp() definition since there are no more callers. Signed-off-by: Jeff Layton <jlayton@redhat.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-12 00:32:03 -04:00
Al Viro	d6b2123802	make sure that we always have a return path from kernel_execve() The only place where kernel_execve() is called without a way to return to the caller of kernel_thread() callback is kernel_post(). Reorganize kernel_init()/kernel_post() - instead of the former calling the latter in the end (and getting freed by it), have the latter begin with calling the former (and turn the latter into kernel_thread() callback, of course). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-11 21:42:35 -04:00
Al Viro	ba4df2808a	don't bother with kernel_thread/kernel_execve for launching linuxrc exec_usermodehelper_fns() will do just fine... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-10-11 21:40:30 -04:00
Frederic Weisbecker	d677124b1f	rcu: Advise most users not to enable RCU user mode Discourage distros from enabling CONFIG_RCU_USER_QS because it brings overhead for no benefits yet. It's not a useful feature on its own until we can fully run an adaptive tickless kernel. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-10-10 17:35:01 -07:00
David Howells	48ba2462ac	MODSIGN: Implement module signature checking Check the signature on the module against the keys compiled into the kernel or available in a hardware key store. Currently, only RSA keys are supported - though that's easy enough to change, and the signature is expected to contain raw components (so not a PGP or PKCS#7 formatted blob). The signature blob is expected to consist of the following pieces in order: (1) The binary identifier for the key. This is expected to match the SubjectKeyIdentifier from an X.509 certificate. Only X.509 type identifiers are currently supported. (2) The signature data, consisting of a series of MPIs in which each is in the format of a 2-byte BE word sizes followed by the content data. (3) A 12 byte information block of the form: struct module_signature { enum pkey_algo algo : 8; enum pkey_hash_algo hash : 8; enum pkey_id_type id_type : 8; u8 __pad; __be32 id_length; __be32 sig_length; }; The three enums are defined in crypto/public_key.h. 'algo' contains the public-key algorithm identifier (0->DSA, 1->RSA). 'hash' contains the digest algorithm identifier (0->MD4, 1->MD5, 2->SHA1, etc.). 'id_type' contains the public-key identifier type (0->PGP, 1->X.509). '__pad' should be 0. 'id_length' should contain in the binary identifier length in BE form. 'sig_length' should contain in the signature data length in BE form. The lengths are in BE order rather than CPU order to make dealing with cross-compilation easier. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (minor Kconfig fix)	2012-10-10 20:06:10 +10:30
David Howells	ea0b6dcf71	MODSIGN: Provide Kconfig options Provide kernel configuration options for module signing. The following configuration options are added: CONFIG_MODULE_SIG_SHA1 CONFIG_MODULE_SIG_SHA224 CONFIG_MODULE_SIG_SHA256 CONFIG_MODULE_SIG_SHA384 CONFIG_MODULE_SIG_SHA512 These select the cryptographic hash used to digest the data prior to signing. Additionally, the crypto module selected will be built into the kernel as it won't be possible to load it as a module without incurring a circular dependency when the kernel tries to check its signature. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-10-10 20:01:20 +10:30
Rusty Russell	106a4ee258	module: signature checking hook We do a very simple search for a particular string appended to the module (which is cache-hot and about to be SHA'd anyway). There's both a config option and a boot parameter which control whether we accept or fail with unsigned modules and modules that are signed with an unknown key. If module signing is enabled, the kernel will be tainted if a module is loaded that is unsigned or has a signature for which we don't have the key. (Useful feedback and tweaks by David Howells <dhowells@redhat.com>) Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-10-10 20:00:55 +10:30
Michel Lespinasse	147e615f83	prio_tree: remove After both prio_tree users have been converted to use red-black trees, there is no need to keep around the prio tree library anymore. Signed-off-by: Michel Lespinasse <walken@google.com> Cc: Rik van Riel <riel@redhat.com> Cc: Hillf Danton <dhillf@gmail.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-09 16:22:40 +09:00
Catalin Marinas	7ac57a89de	Kconfig: clean up the "#if defined(arch)" list for exception-trace sysctl entry Introduce SYSCTL_EXCEPTION_TRACE config option and selec it in the architectures requiring support for the "exception-trace" debug_table entry in kernel/sysctl.c. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-09 16:22:14 +09:00
Catalin Marinas	af1839eb4b	Kconfig: clean up the long arch list for the UID16 config option Introduce HAVE_UID16 config option and select it in corresponding architecture Kconfig files. UID16 now only depends on HAVE_UID16. Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Mikael Starvik <starvik@axis.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-09 16:22:13 +09:00
David Howells	4520c6a49a	X.509: Add simple ASN.1 grammar compiler Add a simple ASN.1 grammar compiler. This produces a bytecode output that can be fed to a decoder to inform the decoder how to interpret the ASN.1 stream it is trying to parse. Action functions can be specified in the grammar by interpolating: ({ foo }) after a type, for example: SubjectPublicKeyInfo ::= SEQUENCE { algorithm AlgorithmIdentifier, subjectPublicKey BIT STRING ({ do_key_data }) } The decoder is expected to call these after matching this type and parsing the contents if it is a constructed type. The grammar compiler does not currently support the SET type (though it does support SET OF) as I can't see a good way of tracking which members have been encountered yet without using up extra stack space. Currently, the grammar compiler will fail if more than 256 bytes of bytecode would be produced or more than 256 actions have been specified as it uses 8-bit jump values and action indices to keep space usage down. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-10-08 13:50:19 +10:30
Alex Kelly	046d662f48	coredump: make core dump functionality optional Adds an expert Kconfig option, CONFIG_COREDUMP, which allows disabling of core dump. This saves approximately 2.6k in the compiled kernel, and complements CONFIG_ELF_CORE, which now depends on it. CONFIG_COREDUMP also disables coredump-related sysctls, except for suid_dumpable and related functions, which are necessary for ptrace. [akpm@linux-foundation.org: fix binfmt_aout.c build] Signed-off-by: Alex Kelly <alex.page.kelly@gmail.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Randy Dunlap <rdunlap@xenotime.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-06 03:05:15 +09:00
Andi Kleen	754b7b63d1	sections: disable const sections for PA-RISC v2 The PA-RISC tool chain seems to have some problem with correct read/write attributes on sections. This causes problems when the const sections are fixed up for other architecture to only contain truly read-only data. Disable const sections for PA-RISC This can cause a bit of noise with modpost. Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: "James E.J. Bottomley" <jejb@parisc-linux.org> Cc: Helge Deller <deller@gmx.de> Acked-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-10-06 03:04:37 +09:00
Linus Torvalds	437589a74b	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace changes from Eric Biederman: "This is a mostly modest set of changes to enable basic user namespace support. This allows the code to code to compile with user namespaces enabled and removes the assumption there is only the initial user namespace. Everything is converted except for the most complex of the filesystems: autofs4, 9p, afs, ceph, cifs, coda, fuse, gfs2, ncpfs, nfs, ocfs2 and xfs as those patches need a bit more review. The strategy is to push kuid_t and kgid_t values are far down into subsystems and filesystems as reasonable. Leaving the make_kuid and from_kuid operations to happen at the edge of userspace, as the values come off the disk, and as the values come in from the network. Letting compile type incompatible compile errors (present when user namespaces are enabled) guide me to find the issues. The most tricky areas have been the places where we had an implicit union of uid and gid values and were storing them in an unsigned int. Those places were converted into explicit unions. I made certain to handle those places with simple trivial patches. Out of that work I discovered we have generic interfaces for storing quota by projid. I had never heard of the project identifiers before. Adding full user namespace support for project identifiers accounts for most of the code size growth in my git tree. Ultimately there will be work to relax privlige checks from "capable(FOO)" to "ns_capable(user_ns, FOO)" where it is safe allowing root in a user names to do those things that today we only forbid to non-root users because it will confuse suid root applications. While I was pushing kuid_t and kgid_t changes deep into the audit code I made a few other cleanups. I capitalized on the fact we process netlink messages in the context of the message sender. I removed usage of NETLINK_CRED, and started directly using current->tty. Some of these patches have also made it into maintainer trees, with no problems from identical code from different trees showing up in linux-next. After reading through all of this code I feel like I might be able to win a game of kernel trivial pursuit." Fix up some fairly trivial conflicts in netfilter uid/git logging code. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (107 commits) userns: Convert the ufs filesystem to use kuid/kgid where appropriate userns: Convert the udf filesystem to use kuid/kgid where appropriate userns: Convert ubifs to use kuid/kgid userns: Convert squashfs to use kuid/kgid where appropriate userns: Convert reiserfs to use kuid and kgid where appropriate userns: Convert jfs to use kuid/kgid where appropriate userns: Convert jffs2 to use kuid and kgid where appropriate userns: Convert hpfs to use kuid and kgid where appropriate userns: Convert btrfs to use kuid/kgid where appropriate userns: Convert bfs to use kuid/kgid where appropriate userns: Convert affs to use kuid/kgid wherwe appropriate userns: On alpha modify linux_to_osf_stat to use convert from kuids and kgids userns: On ia64 deal with current_uid and current_gid being kuid and kgid userns: On ppc convert current_uid from a kuid before printing. userns: Convert s390 getting uid and gid system calls to use kuid and kgid userns: Convert s390 hypfs to use kuid and kgid where appropriate userns: Convert binder ipc to use kuids userns: Teach security_path_chown to take kuids and kgids userns: Add user namespace support to IMA userns: Convert EVM to deal with kuids and kgids in it's hmac computation ...	2012-10-02 11:11:09 -07:00
Linus Torvalds	06d2fe153b	Driver core merge for 3.7-rc1 Here is the big driver core update for 3.7-rc1. A number of firmware_class.c updates (as you saw a month or so ago), and some hyper-v updates and some printk fixes as well. All patches that are outside of the drivers/base area have been acked by the respective maintainers, and have all been in the linux-next tree for a while. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) iEYEABECAAYFAlBp3vkACgkQMUfUDdst+ylQoACgldktGFgkCLzH+rGYthrXOC5P 9hUAnjmOhdoHlMTL81vWTlH+BrGernym =khrr -----END PGP SIGNATURE----- Merge tag 'driver-core-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core merge from Greg Kroah-Hartman: "Here is the big driver core update for 3.7-rc1. A number of firmware_class.c updates (as you saw a month or so ago), and some hyper-v updates and some printk fixes as well. All patches that are outside of the drivers/base area have been acked by the respective maintainers, and have all been in the linux-next tree for a while. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" * tag 'driver-core-3.6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (95 commits) memory: tegra{20,30}-mc: Fix reading incorrect register in mc_readl() device.h: Add missing inline to #ifndef CONFIG_PRINTK dev_vprintk_emit memory: emif: Add ifdef CONFIG_DEBUG_FS guard for emif_debugfs_[init\|exit] Documentation: Fixes some translation error in Documentation/zh_CN/gpio.txt Documentation: Remove 3 byte redundant code at the head of the Documentation/zh_CN/arm/booting Documentation: Chinese translation of Documentation/video4linux/omap3isp.txt device and dynamic_debug: Use dev_vprintk_emit and dev_printk_emit dev: Add dev_vprintk_emit and dev_printk_emit netdev_printk/netif_printk: Remove a superfluous logging colon netdev_printk/dynamic_netdev_dbg: Directly call printk_emit dev_dbg/dynamic_debug: Update to use printk_emit, optimize stack driver-core: Shut up dev_dbg_reatelimited() without DEBUG tools/hv: Parse /etc/os-release tools/hv: Check for read/write errors tools/hv: Fix exit() error code tools/hv: Fix file handle leak Tools: hv: Implement the KVP verb - KVP_OP_GET_IP_INFO Tools: hv: Rename the function kvp_get_ip_address() Tools: hv: Implement the KVP verb - KVP_OP_SET_IP_INFO Tools: hv: Add an example script to configure an interface ...	2012-10-01 12:10:44 -07:00
Linus Torvalds	81f56e5375	Linux support for the 64-bit ARM architecture (AArch64) Features currently supported: - 39-bit address space for user and kernel (each) - 4KB and 64KB page configurations - Compat (32-bit) user applications (ARMv7, EABI only) - Flattened Device Tree (mandated for all AArch64 platforms) - ARM generic timers -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (GNU/Linux) iQIcBAABAgAGBQJQabRiAAoJEGvWsS0AyF7xXgcQAK+FTXt0ikdQYMkV5AIZXb9i xHRhuiZWx2vKyk0mCqpyGLY58GSmSb6uTBg/2P2Ej7vXdH/RB2goPzjlspfjkDL4 o8RJp7eQ07Uz3KRDYEJgMP8xKZid6KFG93RJ6TjjpKZLuDBdwiG1GP1vb0jVcWfo ttZrj/aI8lMcqrh3Vq5qefP7GWP1OVATqeaGTiT7oo38pXwF3t237xfBr2iDGFBp ZgIRddrxpa7JYUesfJDDDdGHvLq7Vh2jJV+io9qasBZDrtppGJIhZ0vUni2DgIi7 r4i1LcynDN4JaG0maZ4U/YQm74TCD4BqxV8GJ7zwLPTWeN+of+skjhPSLOkA+0fp I+sWjXlv200gDfJZ9qnUld2kFpoDfJi2b7fNDouSDd2OhmVOVWG3jnVP4Z7meVSb O8BYzWDdsAiabuwciUY3OsmW6424lT93b2v86Vncs4unKMvEjOPxYZbUxhqX8f2j gsmWwwD/yS4THx2B6OyW9VT3I5J6miqs2Glt/GG6vPWT5AKQJn9jCxKaBGhPMPIs xe5/GycBYjdk/Y8qRjegxFbEqzQuiRzmkeFn5jwjmBLqpGNbZDpvMaL6adhAKM5/ v6UIKa91ra4fC9N0h6G61pOc9N9DbT8wPbCbdYY0RMTMRuLDZDgAM3Bvz0r2APdD 96leNy6vx684hbkCSLJs =buJB -----END PGP SIGNATURE----- Merge tag 'arm64-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64 Pull arm64 support from Catalin Marinas: "Linux support for the 64-bit ARM architecture (AArch64) Features currently supported: - 39-bit address space for user and kernel (each) - 4KB and 64KB page configurations - Compat (32-bit) user applications (ARMv7, EABI only) - Flattened Device Tree (mandated for all AArch64 platforms) - ARM generic timers" * tag 'arm64-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64: (35 commits) arm64: ptrace: remove obsolete ptrace request numbers from user headers arm64: Do not set the SMP/nAMP processor bit arm64: MAINTAINERS update arm64: Build infrastructure arm64: Miscellaneous header files arm64: Generic timers support arm64: Loadable modules arm64: Miscellaneous library functions arm64: Performance counters support arm64: Add support for /proc/sys/debug/exception-trace arm64: Debugging support arm64: Floating point and SIMD arm64: 32-bit (compat) applications support arm64: User access library functions arm64: Signal handling support arm64: VDSO support arm64: System calls handling arm64: ELF definitions arm64: SMP support arm64: DMA mapping API ...	2012-10-01 11:51:57 -07:00
Linus Torvalds	3b29b03a46	Merge branch 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/EFI changes from Ingo Molnar: "EFI loader robustness enhancements plus smaller fixes" * 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi: Fix the ACPI BGRT driver for images located in EFI boot services memory efi: Add a function to look up existing IO memory mappings efi: Defer freeing boot services memory until after ACPI init x86, EFI: Calculate the EFI framebuffer size instead of trusting the firmware efifb: Skip DMI checks if the bootloader knows what it's doing efi: initialize efi.runtime_version to make query_variable_info/update_capsule workable efi: Build EFI stub with EFI-appropriate options X86: Improve GOP detection in the EFI boot stub	2012-10-01 11:08:12 -07:00
Linus Torvalds	0b981cb94b	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler changes from Ingo Molnar: "Continued quest to clean up and enhance the cputime code by Frederic Weisbecker, in preparation for future tickless kernel features. Other than that, smallish changes." Fix up trivial conflicts due to additions next to each other in arch/{x86/}Kconfig * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) cputime: Make finegrained irqtime accounting generally available cputime: Gather time/stats accounting config options into a single menu ia64: Reuse system and user vtime accounting functions on task switch ia64: Consolidate user vtime accounting vtime: Consolidate system/idle context detection cputime: Use a proper subsystem naming for vtime related APIs sched: cpu_power: enable ARCH_POWER sched/nohz: Clean up select_nohz_load_balancer() sched: Fix load avg vs. cpu-hotplug sched: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW sched: Fix nohz_idle_balance() sched: Remove useless code in yield_to() sched: Add time unit suffix to sched sysctl knobs sched/debug: Limit sd->*_idx range on sysctl sched: Remove AFFINE_WAKEUPS feature flag s390: Remove leftover account_tick_vtime() header cputime: Consolidate vtime handling on context switch sched: Move cputime code to its own file cputime: Generalize CONFIG_VIRT_CPU_ACCOUNTING tile: Remove SD_PREFER_LOCAL leftover ...	2012-10-01 10:43:39 -07:00
Josh Triplett	2223af3890	efi: Fix the ACPI BGRT driver for images located in EFI boot services memory The ACPI BGRT driver accesses the BIOS logo image when it initializes. However, ACPI 5.0 (which introduces the BGRT) recommends putting the logo image in EFI boot services memory, so that the OS can reclaim that memory. Production systems follow this recommendation, breaking the ACPI BGRT driver. Move the bulk of the BGRT code to run during a new EFI late initialization phase, which occurs after switching EFI to virtual mode, and after initializing ACPI, but before freeing boot services memory. Copy the BIOS logo image to kernel memory at that point, and make it accessible to the BGRT driver. Rework the existing ACPI BGRT driver to act as a simple wrapper exposing that image (and the properties from the BGRT) via sysfs. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Link: http://lkml.kernel.org/r/93ce9f823f1c1f3bb88bdd662cce08eee7a17f5d.1348876882.git.josh@joshtriplett.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-09-29 12:21:03 -07:00
Josh Triplett	785107923a	efi: Defer freeing boot services memory until after ACPI init Some new ACPI 5.0 tables reference resources stored in boot services memory, so keep that memory around until we have ACPI and can extract data from it. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Link: http://lkml.kernel.org/r/baaa6d44bdc4eb0c58e5d1b4ccd2c729f854ac55.1348876882.git.josh@joshtriplett.org Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-09-29 12:21:01 -07:00
Frederic Weisbecker	1fd2b4425a	rcu: Userspace RCU extended QS selftest Provide a config option that enables the userspace RCU extended quiescent state on every CPUs by default. This is for testing purpose. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Alessio Igor Bogani <abogani@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Avi Kivity <avi@redhat.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Kevin Hilman <khilman@ti.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2012-09-26 15:47:16 +02:00
Frederic Weisbecker	2b1d5024e1	rcu: Settle config for userspace extended quiescent state Create a new config option under the RCU menu that put CPUs under RCU extended quiescent state (as in dynticks idle mode) when they run in userspace. This require some contribution from architectures to hook into kernel and userspace boundaries. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Alessio Igor Bogani <abogani@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Avi Kivity <avi@redhat.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Christoph Lameter <cl@linux.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Gilad Ben Yossef <gilad@benyossef.com> Cc: Hakan Akkan <hakanakkan@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Kevin Hilman <khilman@ti.com> Cc: Max Krasnyansky <maxk@qualcomm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sven-Thorsten Dietrich <thebigcorporation@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2012-09-26 15:44:04 +02:00
Frederic Weisbecker	fdf9c35650	cputime: Make finegrained irqtime accounting generally available There is no known reason for this option to be unavailable on other archs than x86. They just need to call enable_sched_clock_irqtime() if they have a sufficiently finegrained clock to make it working. Move it to the general option and let the user choose between it and pure tick based or virtual cputime accounting. Note that virtual cputime accounting already performs a finegrained irqtime accounting. CONFIG_IRQ_TIME_ACCOUNTING is a kind of middle ground between tick and virtual based accounting. So CONFIG_IRQ_TIME_ACCOUNTING and CONFIG_VIRT_CPU_ACCOUNTING are mutually exclusive choices. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>	2012-09-25 16:01:36 +02:00
Frederic Weisbecker	391dc69c68	cputime: Gather time/stats accounting config options into a single menu This debloats a bit the general config menu and make these config options easier to find. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>	2012-09-25 15:43:00 +02:00
Eric W. Biederman	7223546586	userns: Convert the ufs filesystem to use kuid/kgid where appropriate Cc: Evgeniy Dushistov <dushistov@mail.ru> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 04:28:00 -07:00
Eric W. Biederman	c2ba138a27	userns: Convert the udf filesystem to use kuid/kgid where appropriate Cc: Jan Kara <jack@suse.cz> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 04:18:54 -07:00
Eric W. Biederman	39241beb78	userns: Convert ubifs to use kuid/kgid Cc: Artem Bityutskiy <dedekind1@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:36 -07:00
Eric W. Biederman	61293ee274	userns: Convert squashfs to use kuid/kgid where appropriate Cc: Phillip Lougher <phillip@squashfs.org.uk> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:35 -07:00
Eric W. Biederman	df814654f3	userns: Convert reiserfs to use kuid and kgid where appropriate Cc: reiserfs-devel@vger.kernel.org Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:34 -07:00
Eric W. Biederman	c18cdc1a3e	userns: Convert jfs to use kuid/kgid where appropriate Cc: Dave Kleikamp <shaggy@kernel.org> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:33 -07:00
Eric W. Biederman	0cfe53d3c3	userns: Convert jffs2 to use kuid and kgid where appropriate - General routine uid/gid conversion work - When storing posix acls treat ACL_USER and ACL_GROUP separately so I can call from_kuid or from_kgid as appropriate. - When reading posix acls treat ACL_USER and ACL_GROUP separately so I can call make_kuid or make_kgid as appropriate. Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:33 -07:00
Eric W. Biederman	0e1a43c716	userns: Convert hpfs to use kuid and kgid where appropriate Cc: Mikulas Patocka <mikulas@artax.karlin.mff.cuni.cz> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:32 -07:00
Eric W. Biederman	2f2f43d3c7	userns: Convert btrfs to use kuid/kgid where appropriate Cc: Chris Mason <chris.mason@fusionio.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:31 -07:00
Eric W. Biederman	7f5b82b835	userns: Convert bfs to use kuid/kgid where appropriate Cc: "Tigran A. Aivazian" <tigran@aivazian.fsnet.co.uk> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:31 -07:00
Eric W. Biederman	8fed10be00	userns: Convert affs to use kuid/kgid wherwe appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:30 -07:00
Eric W. Biederman	4a2ebb93bf	userns: Convert binder ipc to use kuids Cc: Arve Hjønnevåg <arve@android.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:26 -07:00
Eric W. Biederman	8b94eea4bf	userns: Add user namespace support to IMA Use kuid's in the IMA rules. When reporting the current uid in audit logs use from_kuid to get a usable value. Cc: Mimi Zohar <zohar@us.ibm.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:24 -07:00
Eric W. Biederman	cf9c93526f	userns: Convert EVM to deal with kuids and kgids in it's hmac computation Cc: Mimi Zohar <zohar@us.ibm.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:24 -07:00
Eric W. Biederman	29f82ae56e	userns: Convert hostfs to use kuid and kgid where appropriate Cc: Jeff Dike <jdike@addtoit.com> Cc: Richard Weinberger <richard@nod.at> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:23 -07:00
Eric W. Biederman	609fcd1b3a	userns: Convert tomoyo to use kuid and kgid where appropriate Acked-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:22 -07:00
Eric W. Biederman	2db8145293	userns: Convert apparmor to use kuid and kgid where appropriate Cc: John Johansen <john.johansen@canonical.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:21 -07:00
Eric W. Biederman	e4849737f7	userns: Convert loop to use kuid_t instead of uid_t Cc: Jens Axboe <jaxboe@fusionio.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:20 -07:00
Eric W. Biederman	d03ca5820d	userns: Convert ipathfs to use GLOBAL_ROOT_UID and GLOBAL_ROOT_GID Acked-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:19 -07:00
Eric W. Biederman	2f83ffa874	userns: Convert freevxfs to use kuid/kgid where appropriate Cc: Christoph Hellwig <hch@infradead.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:19 -07:00
Eric W. Biederman	a726ecce75	userns: Convert the sysv filesystem to use kuid/kgid where appropriate Cc: Christoph Hellwig <hch@infradead.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:18 -07:00
Eric W. Biederman	85a03d1bba	userns: Convert the qnx6 filesystem to use kuid/kgid where appropriate Cc: Kai Bankett <chaosman@ontika.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:17 -07:00
Eric W. Biederman	511728d778	userns: Convert the qnx4 filesystem to use kuid/kgid where appropriate Acked-by: Anders Larsen <al@alarsen.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:17 -07:00
Eric W. Biederman	80fcbe751f	userns: Convert omfs to use kuid and kgid where appropriate Acked-by: Bob Copeland <me@bobcopeland.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:16 -07:00
Eric W. Biederman	b29f7751c9	userns: Convert ntfs to use kuid and kgid where appropriate Cc: Anton Altaparmakov <anton@tuxera.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:15 -07:00
Eric W. Biederman	305d3d0dbc	userns: Convert nillfs2 to use kuid/kgid where appropriate Acked-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:15 -07:00
Eric W. Biederman	f303bdc55e	userns: Convert minix to use kuid/kgid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:14 -07:00
Eric W. Biederman	1a0a994ebe	userns: Convert logfs to use kuid/kgid where appropriate Cc: Joern Engel <joern@logfs.org> Cc: Prasad Joshi <prasadjoshi.linux@gmail.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:13 -07:00
Eric W. Biederman	ba64e2b9e3	userns: Convert isofs to use kuid/kgid where appropriate Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:12 -07:00
Eric W. Biederman	16525e3f14	userns: Convert hfsplus to use kuid and kgid where appropriate Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:12 -07:00
Eric W. Biederman	43b5e4ccd4	userns: Convert hfs to use kuid and kgid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:11 -07:00
Eric W. Biederman	d001b05365	userns: Convert exofs to use kuid/kgid where appropriate Cc: Benny Halevy <bhalevy@tonian.com> Acked-by: Boaz Harrosh <bharrosh@panasas.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:10 -07:00
Eric W. Biederman	5d4ea4da6a	userns: Convert efs to use kuid/kgid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:10 -07:00
Eric W. Biederman	cdf8c58a35	userns: Convert ecryptfs to use kuid/kgid where appropriate Cc: Tyler Hicks <tyhicks@canonical.com> Cc: Dustin Kirkland <dustin.kirkland@gazzang.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:09 -07:00
Eric W. Biederman	a7d9cfe97b	userns: Convert cramfs to use kuid/kgid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:08 -07:00
Eric W. Biederman	31aba059bb	userns: Convert befs to use kuid/kgid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:08 -07:00
Eric W. Biederman	c010d1ff4f	userns: Convert adfs to use kuid and kgid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:07 -07:00
Eric W. Biederman	9a11f4513c	userns: Convert xenfs to use kuid and kgid where appropriate Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:06 -07:00
Eric W. Biederman	a0eb3a05a8	userns: Convert hugetlbfs to use kuid/kgid where appropriate Note sysctl_hugetlb_shm_group can only be written in the root user in the initial user namespace, so we can assume sysctl_hugetlb_shm_group is in the initial user namespace. Cc: William Irwin <wli@holomorphy.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:05 -07:00
Eric W. Biederman	91fa2ccaa8	userns: Convert devtmpfs to use GLOBAL_ROOT_UID and GLOBAL_ROOT_GID Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:13:05 -07:00
Eric W. Biederman	b9b73f7c4d	userns: Convert usb functionfs to use kuid/kgid where appropriate Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Felipe Balbi <balbi@ti.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:12:51 -07:00
Eric W. Biederman	32d639c66e	userns: Convert gadgetfs to use kuid and kgid where appropriate Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Felipe Balbi <balbi@ti.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-21 03:12:16 -07:00
Eric W. Biederman	170782eb89	userns: Convert fat to use kuid/kgid where appropriate Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-20 06:11:55 -07:00
Eric W. Biederman	1a06d420ce	userns: Convert quota Now that the type changes are done, here is the final set of changes to make the quota code work when user namespaces are enabled. Small cleanups and fixes to make the code build when user namespaces are enabled. Cc: Jan Kara <jack@suse.cz> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2012-09-18 01:01:42 -07:00
Eric W. Biederman	431f19744d	userns: Convert quota netlink aka quota_send_warning Modify quota_send_warning to take struct kqid instead a type and identifier pair. When sending netlink broadcasts always convert uids and quota identifiers into the intial user namespace. There is as yet no way to send a netlink broadcast message with different contents to receivers in different namespaces, so for the time being just map all of the identifiers into the initial user namespace which preserves the current behavior. Change the callers of quota_send_warning in gfs2, xfs and dquot to generate a struct kqid to pass to quota send warning. When all of the user namespaces convesions are complete a struct kqid values will be availbe without need for conversion, but a conversion is needed now to avoid needing to convert everything at once. Cc: Ben Myers <bpm@sgi.com> Cc: Alex Elder <elder@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Jan Kara <jack@suse.cz> Cc: Steven Whitehouse <swhiteho@redhat.com> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2012-09-18 01:01:40 -07:00
Eric W. Biederman	74a8a10378	userns: Convert qutoactl Update the quotactl user space interface to successfull compile with user namespaces support enabled and to hand off quota identifiers to lower layers of the kernel in struct kqid instead of type and qid pairs. The quota on function is not converted because while it takes a quota type and an id. The id is the on disk quota format to use, which is something completely different. The signature of two struct quotactl_ops methods were changed to take struct kqid argumetns get_dqblk and set_dqblk. The dquot, xfs, and ocfs2 implementations of get_dqblk and set_dqblk are minimally changed so that the code continues to work with the change in parameter type. This is the first in a series of changes to always store quota identifiers in the kernel in struct kqid and only use raw type and qid values when interacting with on disk structures or userspace. Always using struct kqid internally makes it hard to miss places that need conversion to or from the kernel internal values. Cc: Jan Kara <jack@suse.cz> Cc: Dave Chinner <david@fromorbit.com> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Ben Myers <bpm@sgi.com> Cc: Alex Elder <elder@kernel.org> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2012-09-18 01:01:39 -07:00
Eric W. Biederman	69552c0c50	userns: Convert configfs to use kuid and kgid where appropriate Cc: Joel Becker <jlbec@evilplan.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-18 01:01:37 -07:00
Eric W. Biederman	af84df93ff	userns: Convert extN to support kuids and kgids in posix acls Convert ext2, ext3, and ext4 to fully support the posix acl changes, using e_uid e_gid instead e_id. Enabled building with posix acls enabled, all filesystems supporting user namespaces, now also support posix acls when user namespaces are enabled. Cc: Theodore Tso <tytso@mit.edu> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Jan Kara <jack@suse.cz> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-18 01:01:36 -07:00
Eric W. Biederman	d20b92ab66	userns: Teach trace to use from_kuid - When tracing capture the kuid. - When displaying the data to user space convert the kuid into the user namespace of the process that opened the report file. Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-18 01:01:34 -07:00
Eric W. Biederman	f8f3d4de2d	userns: Convert bsd process accounting to use kuid and kgid where appropriate BSD process accounting conveniently passes the file the accounting records will be written into to do_acct_process. The file credentials captured the user namespace of the opener of the file. Use the file credentials to format the uid and the gid of the current process into the user namespace of the user that started the bsd process accounting. Cc: Pavel Emelyanov <xemul@openvz.org> Reviewed-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-18 01:01:33 -07:00
Eric W. Biederman	4bd6e32ace	userns: Convert taskstats to handle the user and pid namespaces. - Explicitly limit exit task stat broadcast to the initial user and pid namespaces, as it is already limited to the initial network namespace. - For broadcast task stats explicitly generate all of the idenitiers in terms of the initial user namespace and the initial pid namespace. - For request stats report them in terms of the current user namespace and the current pid namespace. Netlink messages are delivered syncrhonously to the kernel allowing us to get the user namespace and the pid namespace from the current task. - Pass the namespaces for representing pids and uids and gids into bacct_add_task. Cc: Balbir Singh <bsingharora@gmail.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-18 01:01:32 -07:00
Eric W. Biederman	cca080d9b6	userns: Convert audit to work with user namespaces enabled - Explicitly format uids gids in audit messges in the initial user namespace. This is safe because auditd is restrected to be in the initial user namespace. - Convert audit_sig_uid into a kuid_t. - Enable building the audit code and user namespaces at the same time. The net result is that the audit subsystem now uses kuid_t and kgid_t whenever possible making it almost impossible to confuse a raw uid_t with a kuid_t preventing bugs. Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Paris <eparis@redhat.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-18 01:00:26 -07:00
Catalin Marinas	8c2c3df31e	arm64: Build infrastructure This patch adds Makefile and Kconfig files required for building an AArch64 kernel. Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Tony Lindgren <tony@atomide.com> Acked-by: Nicolas Pitre <nico@linaro.org> Acked-by: Olof Johansson <olof@lixom.net> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Acked-by: Arnd Bergmann <arnd@arndb.de>	2012-09-17 13:42:21 +01:00
Eric W. Biederman	c6089735e7	userns: net: Call key_alloc with GLOBAL_ROOT_UID, GLOBAL_ROOT_GID instead of 0, 0 In net/dns_resolver/dns_key.c and net/rxrpc/ar-key.c make them work with user namespaces enabled where key_alloc takes kuids and kgids. Pass GLOBAL_ROOT_UID and GLOBAL_ROOT_GID instead of bare 0's. Cc: Sage Weil <sage@inktank.com> Cc: ceph-devel@vger.kernel.org Cc: David Howells <dhowells@redhat.com> Cc: David Miller <davem@davemloft.net> Cc: linux-afs@lists.infradead.org Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-13 18:28:04 -07:00
Eric W. Biederman	9a56c2db49	userns: Convert security/keys to the new userns infrastructure - Replace key_user ->user_ns equality checks with kuid_has_mapping checks. - Use from_kuid to generate key descriptions - Use kuid_t and kgid_t and the associated helpers instead of uid_t and gid_t - Avoid potential problems with file descriptor passing by displaying keys in the user namespace of the opener of key status proc files. Cc: linux-security-module@vger.kernel.org Cc: keyrings@linux-nfs.org Cc: David Howells <dhowells@redhat.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-13 18:28:02 -07:00
Eric W. Biederman	5fce5e0bbd	userns: Convert drm to use kuid and kgid and struct pid where appropriate Blink Blink this had not been converted to use struct pid ages ago? - On drm open capture the openers kuid and struct pid. - On drm close release the kuid and struct pid - When reporting the uid and pid convert the kuid and struct pid into values in the appropriate namespace. Cc: dri-devel@lists.freedesktop.org Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-13 14:32:24 -07:00
Eric W. Biederman	1efdb69b0b	userns: Convert ipc to use kuid and kgid where appropriate - Store the ipc owner and creator with a kuid - Store the ipc group and the crators group with a kgid. - Add error handling to ipc_update_perms, allowing it to fail if the uids and gids can not be converted to kuids or kgids. - Modify the proc files to display the ipc creator and owner in the user namespace of the opener of the proc file. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-06 22:17:20 -07:00
Eric W. Biederman	9582d90196	userns: Convert process event connector to handle kuids and kgids - Only allow asking for events from the initial user and pid namespace, where we generate the events in. - Convert kuids and kgids into the initial user namespace to report them via the process event connector. Cc: David Miller <davem@davemloft.net> Acked-by: Evgeniy Polyakov <zbr@ioremap.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-06 19:37:10 -07:00
Eric W. Biederman	7dc05881b6	userns: Convert debugfs to use kuid/kgid where appropriate. Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-09-06 19:02:52 -07:00
Greg Kroah-Hartman	45f035ab9b	CONFIG_HOTPLUG should be always on CONFIG_HOTPLUG is a very old option, back when we had static systems and it was odd that any type of device would be removed or added after the system had started up. It is quite hard to disable it these days, and even if you do, it only saves you about 200 bytes. However, if it is disabled, lots of bugs show up because it is almost never tested if the option is disabled. This is a step to eventually just remove the option entirely, which will clean up all of the devinit* variable and function pointer options, that everyone (myself include) ends up getting wrong eventually, causing real problems when memory segments are removed yet we don't expect them to be. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Bjorn Helgaas <bhelgaas@google.com>	2012-09-06 13:26:16 -07:00
Ingo Molnar	59f979455d	Merge branch 'sched/urgent' into sched/core Merge in the current fixes branch, we are going to apply dependent patches. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-09-04 14:31:00 +02:00
Eric W. Biederman	c9235f4872	userns: Make credential debugging user namespace safe. Cc: David Howells <dhowells@redhat.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-23 22:54:18 -07:00
Eric W. Biederman	bc45dae323	userns: Enable building of pf_key sockets when user namespace support is enabled. Enable building of pf_key sockets and user namespace support at the same time. This combination builds successfully so there is no reason to forbid it. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>	2012-08-23 22:52:54 -07:00
Frederic Weisbecker	b952741c80	cputime: Generalize CONFIG_VIRT_CPU_ACCOUNTING S390, ia64 and powerpc all define their own version of CONFIG_VIRT_CPU_ACCOUNTING. Generalize the config and its description to a single place to avoid duplication. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Fenghua Yu <fenghua.yu@intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org>	2012-08-17 16:31:08 +02:00
Eric W. Biederman	0625c883bc	userns: Convert tun/tap to use kuid and kgid where appropriate Cc: Maxim Krasnyansky <maxk@qualcomm.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:55:31 -07:00
Eric W. Biederman	1efa29cd41	userns: Make the airo wireless driver use kuids for proc uids and gids Cc: Dan Carpenter <dan.carpenter@oracle.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: John W. Linville <linville@tuxdriver.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:55:31 -07:00
Eric W. Biederman	26711a791e	userns: xt_owner: Add basic user namespace support. - Only allow adding matches from the initial user namespace - Add the appropriate conversion functions to handle matches against sockets in other user namespaces. Cc: Jan Engelhardt <jengelh@medozas.de> Cc: Patrick McHardy <kaber@trash.net> Cc: Pablo Neira Ayuso <pablo@netfilter.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:55:30 -07:00
Eric W. Biederman	da7428080a	userns xt_recent: Specify the owner/group of ip_list_perms in the initial user namespace xt_recent creates a bunch of proc files and initializes their uid and gids to the values of ip_list_uid and ip_list_gid. When initialize those proc files convert those values to kuids so they can continue to reside on the /proc inode. Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Jan Engelhardt <jengelh@medozas.de> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:55:29 -07:00
Eric W. Biederman	8c6e2a941a	userns: Convert xt_LOG to print socket kuids and kgids as uids and gids xt_LOG always writes messages via sb_add via printk. Therefore when xt_LOG logs the uid and gid of a socket a packet came from the values should be converted to be in the initial user namespace. Thus making xt_LOG as user namespace safe as possible. Cc: Pablo Neira Ayuso <pablo@netfilter.org> Cc: Patrick McHardy <kaber@trash.net> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:55:29 -07:00
Eric W. Biederman	a6c6796c71	userns: Convert cls_flow to work with user namespaces enabled The flow classifier can use uids and gids of the sockets that are transmitting packets and do insert those uids and gids into the packet classification calcuation. I don't fully understand the details but it appears that we can depend on specific uids and gids when making traffic classification decisions. To work with user namespaces enabled map from kuids and kgids into uids and gids in the initial user namespace giving raw integer values the code can play with and depend on. To avoid issues of userspace depending on uids and gids in packet classifiers installed from other user namespaces and getting confused deny all packet classifiers that use uids or gids that are not comming from a netlink socket in the initial user namespace. Cc: Patrick McHardy <kaber@trash.net> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Changli Gao <xiaosuo@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:55:28 -07:00
Eric W. Biederman	9eea9515cb	userns: nfnetlink_log: Report socket uids in the log sockets user namespace At logging instance creation capture the peer netlink socket's user namespace. Use the captured peer user namespace when reporting socket uids to the peer. The peer socket's user namespace is guaranateed to be valid until the user closes the netlink socket. nfnetlink_log removes instances during the final close of a socket. __build_packet_message does not get called after an instance is destroyed. Therefore it is safe to let the peer netlink socket take care of the user namespace reference counting for us. Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:55:27 -07:00
Eric W. Biederman	d06ca95643	userns: Teach inet_diag to work with user namespaces Compute the user namespace of the socket that we are replying to and translate the kuids of reported sockets into that user namespace. Cc: Andrew Vagin <avagin@openvz.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Pavel Emelyanov <xemul@parallels.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:55:20 -07:00
Eric W. Biederman	d13fda8564	userns: Convert net/ax25 to use kuid_t where appropriate Cc: Ralf Baechle <ralf@linux-mips.org> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:49:42 -07:00
Eric W. Biederman	4f82f45730	net ip6 flowlabel: Make owner a union of struct pid * and kuid_t Correct a long standing omission and use struct pid in the owner field of struct ip6_flowlabel when the share type is IPV6_FL_S_PROCESS. This guarantees we don't have issues when pid wraparound occurs. Use a kuid_t in the owner field of struct ip6_flowlabel when the share type is IPV6_FL_S_USER to add user namespace support. In /proc/net/ip6_flowlabel capture the current pid namespace when opening the file and release the pid namespace when the file is closed ensuring we print the pid owner value that is meaning to the reader of the file. Similarly use from_kuid_munged to print uid values that are meaningful to the reader of the file. This requires exporting pid_nr_ns so that ipv6 can continue to built as a module. Yoiks what silliness Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:49:25 -07:00
Eric W. Biederman	7064d16e16	userns: Use kgids for sysctl_ping_group_range - Store sysctl_ping_group_range as a paire of kgid_t values instead of a pair of gid_t values. - Move the kgid conversion work from ping_init_sock into ipv4_ping_group_range - For invalid cases reset to the default disabled state. With the kgid_t conversion made part of the original value sanitation from userspace understand how the code will react becomes clearer and it becomes possible to set the sysctl ping group range from something other than the initial user namespace. Cc: Vasiliy Kulikov <segoon@openwall.com> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:49:10 -07:00
Eric W. Biederman	a7cb5a49bf	userns: Print out socket uids in a user namespace aware fashion. Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: James Morris <jmorris@namei.org> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Cc: Patrick McHardy <kaber@trash.net> Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net> Cc: Sridhar Samudrala <sri@us.ibm.com> Acked-by: Vlad Yasevich <vyasevich@gmail.com> Acked-by: David S. Miller <davem@davemloft.net> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:48:06 -07:00
Eric W. Biederman	fc5795c8a9	userns: Allow USER_NS and NET simultaneously in Kconfig Now that the networking core is user namespace safe allow networking and user namespaces to be built at the same time. Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-14 21:47:45 -07:00
H. Peter Anvin	f026cfa82f	Revert "x86-64/efi: Use EFI to deal with platform wall clock" This reverts commit `bacef661ac`. This commit has been found to cause serious regressions on a number of ASUS machines at the least. We probably need to provide a 1:1 map in addition to the EFI virtual memory map in order for this to work. Signed-off-by: H. Peter Anvin <hpa@zytor.com> Reported-and-bisected-by: Jérôme Carretero <cJ-ko@zougloub.eu> Cc: Jan Beulich <jbeulich@suse.com> Cc: Matt Fleming <matt.fleming@intel.com> Cc: Matthew Garrett <mjg@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20120805172903.5f8bb24c@zougloub.eu	2012-08-14 09:58:25 -07:00
Eric W. Biederman	d755586052	userns: Allow the usernamespace support to build after the removal of usbfs The user namespace code has an explicit "depends on USB_DEVICEFS = n" dependency to prevent building code that is not yet user namespace safe. With the removal of usbfs from the kernel it is now impossible to satisfy the USB_DEFICEFS = n dependency and thus it is impossible to enable user namespace support in 3.5-rc1. So remove the now useless depedency. Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-08-03 08:28:01 -07:00
Jiang Liu	9adb62a5df	mm/hotplug: correctly setup fallback zonelists when creating new pgdat When hotadd_new_pgdat() is called to create new pgdat for a new node, a fallback zonelist should be created for the new node. There's code to try to achieve that in hotadd_new_pgdat() as below: /* * The node we allocated has no zone fallback lists. For avoiding * to access not-initialized zonelist, build here. / mutex_lock(&zonelists_mutex); build_all_zonelists(pgdat, NULL); mutex_unlock(&zonelists_mutex); But it doesn't work as expected. When hotadd_new_pgdat() is called, the new node is still in offline state because node_set_online(nid) hasn't been called yet. And build_all_zonelists() only builds zonelists for online nodes as: for_each_online_node(nid) { pg_data_t pgdat = NODE_DATA(nid); build_zonelists(pgdat); build_zonelist_cache(pgdat); } Though we hope to create zonelist for the new pgdat, but it doesn't. So add a new parameter "pgdat" the build_all_zonelists() to build pgdat for the new pgdat too. Signed-off-by: Jiang Liu <liuj97@gmail.com> Signed-off-by: Xishi Qiu <qiuxishi@huawei.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.cz> Cc: Minchan Kim <minchan@kernel.org> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Yinghai Lu <yinghai@kernel.org> Cc: Tony Luck <tony.luck@intel.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: David Rientjes <rientjes@google.com> Cc: Keping Chen <chenkeping@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-07-31 18:42:44 -07:00
Andrew Morton	c255a45805	memcg: rename config variables Sanity: CONFIG_CGROUP_MEM_RES_CTLR -> CONFIG_MEMCG CONFIG_CGROUP_MEM_RES_CTLR_SWAP -> CONFIG_MEMCG_SWAP CONFIG_CGROUP_MEM_RES_CTLR_SWAP_ENABLED -> CONFIG_MEMCG_SWAP_ENABLED CONFIG_CGROUP_MEM_RES_CTLR_KMEM -> CONFIG_MEMCG_KMEM [mhocko@suse.cz: fix missed bits] Cc: Glauber Costa <glommer@parallels.com> Acked-by: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Hugh Dickins <hughd@google.com> Cc: Tejun Heo <tj@kernel.org> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: David Rientjes <rientjes@google.com> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-07-31 18:42:43 -07:00
Aneesh Kumar K.V	2bc64a2046	mm/hugetlb: add new HugeTLB cgroup Implement a new controller that allows us to control HugeTLB allocations. The extension allows to limit the HugeTLB usage per control group and enforces the controller limit during page fault. Since HugeTLB doesn't support page reclaim, enforcing the limit at page fault time implies that, the application will get SIGBUS signal if it tries to access HugeTLB pages beyond its limit. This requires the application to know beforehand how much HugeTLB pages it would require for its use. The charge/uncharge calls will be added to HugeTLB code in later patch. Support for cgroup removal will be added in later patches. [akpm@linux-foundation.org: s/CONFIG_CGROUP_HUGETLB_RES_CTLR/CONFIG_MEMCG_HUGETLB/g] [akpm@linux-foundation.org: s/CONFIG_MEMCG_HUGETLB/CONFIG_CGROUP_HUGETLB/g] Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Cc: David Rientjes <rientjes@google.com> Cc: Hillf Danton <dhillf@gmail.com> Reviewed-by: Michal Hocko <mhocko@suse.cz> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-07-31 18:42:40 -07:00
Linus Torvalds	cea8f46c36	Merge branch 'for-linus' of git://git.linaro.org/people/rmk/linux-arm Pull ARM updates from Russell King: "First ARM push of this merge window, post me coming back from holiday. This is what has been in linux-next for the last few weeks. Not much to say which isn't described by the commit summaries." * 'for-linus' of git://git.linaro.org/people/rmk/linux-arm: (32 commits) ARM: 7463/1: topology: Update cpu_power according to DT information ARM: 7462/1: topology: factorize the update of sibling masks ARM: 7461/1: topology: Add arch_scale_freq_power function ARM: 7456/1: ptrace: provide separate functions for tracing syscall {entry,exit} ARM: 7455/1: audit: move syscall auditing until after ptrace SIGTRAP handling ARM: 7454/1: entry: don't bother with syscall tracing on ret_from_fork path ARM: 7453/1: audit: only allow syscall auditing for pure EABI userspace ARM: 7452/1: delay: allow timer-based delay implementation to be selected ARM: 7451/1: arch timer: implement read_current_timer and get_cycles ARM: 7450/1: dcache: select DCACHE_WORD_ACCESS for little-endian ARMv6+ CPUs ARM: 7449/1: use generic strnlen_user and strncpy_from_user functions ARM: 7448/1: perf: remove arm_perf_pmu_ids global enumeration ARM: 7447/1: rwlocks: remove unused branch labels from trylock routines ARM: 7446/1: spinlock: use ticket algorithm for ARMv6+ locking implementation ARM: 7445/1: mm: update CONTEXTIDR register to contain PID of current process ARM: 7444/1: kernel: add arch-timer C3STOP feature ARM: 7460/1: remove asm/locks.h ARM: 7439/1: head.S: simplify initial page table mapping ARM: 7437/1: zImage: Allow DTB command line concatenation with ATAG_CMDLINE ARM: 7436/1: Do not map the vectors page as write-through on UP systems ...	2012-07-27 15:14:26 -07:00
Linus Torvalds	43a1141b9f	Trivial comment changes to cpumask code. I guess it's getting boring. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJQEdqBAAoJENkgDmzRrbjxMOAP/2M865eLJlTAfBlJI7UfLHGw vvB8GVXDLPVIxeDSVcpnbDPNMmtUvf0us4+omGMRVoUeD+lw9PSHq/Q+i5FM//Bn RhP9H8zilYjU0zH1H2kpXh1IJNeRmWVk/H2uwFhQlyiR0sTgT4TXVc1nAaGInpAb k7R8/t7M+fw7nq8HTMaX754/+RSWfbuaKOY6GSg2H/r5VXfUWftODD+ojZsRF/JM YcriA13qqllVgVUc11uhNkvd8zZDUQeYu3QZNGJiZ0xaxsqBdRxyVw4ES2b6X46N nDLBWIXpzWaaa+9VXbAys3S3wTR4b4oZba6GxFyRj7iZ606+ETRVcfcKV3d/xOnW Q+sj3ddq85Ivd+SvHtiuW5RAZY+DFrb4l4ApBw97QxhuVgZWNg1thWgQQF6dxyKj 6cbwa/2gsKSVo53hOZcEH0yUjPBk4G6/8TRH/KgdvZPH618y+npMtHKEcrbjmjl9 IV1wK9lOt7O0zWuUsn8ao5EfxCKInCqZdBmjSoLigupCLSSfby2m7Cw0MQlcONVd GaheUzVf+L8ZcyJxKz+4Wfy0Oc9+gs/mTqbpAZzTgztNAjpb8c3ZbojqXxpGAwFH N749ix0X5urTH/odcGYVna3sNUtHrjUAWYNeUf+AD/0DIjLYNIREOqhTAttPZRGQ JCT22yNnVHZSm2e8gOR/ =XxF4 -----END PGP SIGNATURE----- Merge tag 'cpumask-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus Pull cpumask changes from Rusty Russell: "Trivial comment changes to cpumask code. I guess it's getting boring." Boring is good. * tag 'cpumask-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: cpumask: cpulist_parse() comments correction init: add comments to keep initcall-names in sync with initcall levels cpumask: add a few comments of cpumask functions	2012-07-27 08:34:16 -07:00
Jim Cromie	96263d2863	init: add comments to keep initcall-names in sync with initcall levels main.c has initcall_level_names[] for parse_args to print in debug messages, add comments to keep them in sync with initcalls defined in init.h. Also add "loadable" into comment re not using *_initcall macros in modules, to disambiguate from kernel/params.c and other builtins. Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Acked-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-07-27 09:29:42 +09:30
Linus Torvalds	0a2fe19ccc	Merge branch 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pul x86/efi changes from Ingo Molnar: "This tree adds an EFI bootloader handover protocol, which, once supported on the bootloader side, will make bootup faster and might result in simpler bootloaders. The other change activates the EFI wall clock time accessors on x86-64 as well, instead of the legacy RTC readout." * 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, efi: Handover Protocol x86-64/efi: Use EFI to deal with platform wall clock	2012-07-26 13:13:25 -07:00
Al Viro	4a9d4b024a	switch fput to task_work_add ... and schedule_work() for interrupt/kernel_thread callers (and yes, now it is OK to call from interrupt). We are guaranteed that __fput() will be done before we return to userland (or exit). Note that for fput() from a kernel thread we get an async behaviour; it's almost always OK, but sometimes you might need to have __fput() completed before you do anything else. There are two mechanisms for that - a general barrier (flush_delayed_fput()) and explicit __fput_sync(). Both should be used with care (as was the case for fput() from kernel threads all along). See comments in fs/file_table.c for details. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-07-22 23:57:58 +04:00
Will Deacon	8f827a1468	ARM: 7453/1: audit: only allow syscall auditing for pure EABI userspace The audit tools support only EABI userspace and, since there are no AUDIT_ARCH_* defines for the ARM OABI, it makes sense to allow syscall auditing on ARM only for EABI at the moment. Cc: Eric Paris <eparis@redhat.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2012-07-09 17:44:12 +01:00
Borislav Petkov	19efb72fdc	init: Drop initcall level output `9fb48c744b` ("params: add 3rd arg to option handler callback signature") added similar lines to dmesg: initlevel:0=early, 4 registered initcalls initlevel:1=core, 31 registered initcalls initlevel:2=postcore, 11 registered initcalls initlevel:3=arch, 7 registered initcalls initlevel:4=subsys, 40 registered initcalls initlevel:5=fs, 30 registered initcalls initlevel:6=device, 250 registered initcalls initlevel:7=late, 35 registered initcalls but they don't contain any info for the general user staring at dmesg. I'm very doubtful the count of initcalls registered per level helps anyone so drop that output completely. Cc: Jim Cromie <jim.cromie@gmail.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Jason Baron <jbaron@redhat.com> Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-06-08 14:58:14 +09:30
Rusty Russell	ae82fdb140	module_param: stop double-calling parameters. Commit `026cee0086` "params: <level>_initcall-like kernel parameters" set old-style module parameters to level 0. And we call those level 0 calls where we used to, early in start_kernel(). We also loop through the initcall levels and call the levelled module_params before the corresponding initcall. Unfortunately level 0 is early_init(), so we call the standard module_param calls twice. (Turns out most things don't care, but at least ubi.mtd does). Change the level to -1 for standard module_param calls. Reported-by: Benoît Thébaudeau <benoit.thebaudeau@advansee.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Cc: stable@kernel.org	2012-06-08 14:58:13 +09:30
Jan Beulich	bacef661ac	x86-64/efi: Use EFI to deal with platform wall clock Other than ix86, x86-64 on EFI so far didn't set the {g,s}et_wallclock accessors to the EFI routines, thus incorrectly using raw RTC accesses instead. Simply removing the #ifdef around the respective code isn't enough, however: While so far early get-time calls were done in physical mode, this doesn't work properly for x86-64, as virtual addresses would still need to be set up for all runtime regions (which wasn't the case on the system I have access to), so instead the patch moves the call to efi_enter_virtual_mode() ahead (which in turn allows to drop all code related to calling efi-get-time in physical mode). Additionally the earlier calling of efi_set_executable() requires the CPA code to cope, i.e. during early boot it must be avoided to call cpa_flush_array(), as the first thing this function does is a BUG_ON(irqs_disabled()). Also make the two EFI functions in question here static - they're not being referenced elsewhere. Signed-off-by: Jan Beulich <jbeulich@suse.com> Tested-by: Matt Fleming <matt.fleming@intel.com> Acked-by: Matthew Garrett <mjg@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/4FBFBF5F020000780008637F@nat28.tlf.novell.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-06-06 11:48:05 +02:00
Linus Torvalds	08615d7d85	Merge branch 'akpm' (Andrew's patch-bomb) Merge misc patches from Andrew Morton: - the "misc" tree - stuff from all over the map - checkpatch updates - fatfs - kmod changes - procfs - cpumask - UML - kexec - mqueue - rapidio - pidns - some checkpoint-restore feature work. Reluctantly. Most of it delayed a release. I'm still rather worried that we don't have a clear roadmap to completion for this work. * emailed from Andrew Morton <akpm@linux-foundation.org>: (78 patches) kconfig: update compression algorithm info c/r: prctl: add ability to set new mm_struct::exe_file c/r: prctl: extend PR_SET_MM to set up more mm_struct entries c/r: procfs: add arg_start/end, env_start/end and exit_code members to /proc/$pid/stat syscalls, x86: add __NR_kcmp syscall fs, proc: introduce /proc/<pid>/task/<tid>/children entry sysctl: make kernel.ns_last_pid control dependent on CHECKPOINT_RESTORE aio/vfs: cleanup of rw_copy_check_uvector() and compat_rw_copy_check_uvector() eventfd: change int to __u64 in eventfd_signal() fs/nls: add Apple NLS pidns: make killed children autoreap pidns: use task_active_pid_ns in do_notify_parent rapidio/tsi721: add DMA engine support rapidio: add DMA engine support for RIO data transfers ipc/mqueue: add rbtree node caching support tools/selftests: add mq_perf_tests ipc/mqueue: strengthen checks on mqueue creation ipc/mqueue: correct mq_attr_ok test ipc/mqueue: improve performance of send/recv selftests: add mq_open_tests ...	2012-05-31 18:10:18 -07:00
Randy Dunlap	0a4dd35c67	kconfig: update compression algorithm info There have been new compression algorithms added without updating nearby relevant descriptive text that refers to (a) the number of compression algorithms and (b) the most recent one. Fix these inconsistencies. Signed-off-by: Randy Dunlap <rdunlap@xenotime.net> Reported-by: <qasdfgtyuiop@gmail.com> Cc: Lasse Collin <lasse.collin@tukaani.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Alain Knaff <alain@knaff.lu> Cc: Albin Tonnerre <albin.tonnerre@free-electrons.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:33 -07:00
H Hartley Sweeten	c67e5382fb	init: disable sparse checking of the mount.o source files The init/mount.o source files produce a number of sparse warnings of the type: warning: incorrect type in argument 1 (different address spaces) expected char [noderef] <asn:1>dev_name got char name This is due to the syscalls expecting some of the arguments to be user pointers but they are being passed as kernel pointers. This is harmless but adds a lot of noise to a sparse build. To limit the noise just disable the sparse checking in the relevant source files, but still display a warning so that the user knows this has been done. Since the sparse checking has been disabled we can also remove the __user __force casts that are scattered thru the source. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-31 17:49:27 -07:00
Linus Torvalds	0d167518e0	Merge branch 'for-3.5/core' of git://git.kernel.dk/linux-block Merge block/IO core bits from Jens Axboe: "This is a bit bigger on the core side than usual, but that is purely because we decided to hold off on parts of Tejun's submission on 3.4 to give it a bit more time to simmer. As a consequence, it's seen a long cycle in for-next. It contains: - Bug fix from Dan, wrong locking type. - Relax splice gifting restriction from Eric. - A ton of updates from Tejun, primarily for blkcg. This improves the code a lot, making the API nicer and cleaner, and also includes fixes for how we handle and tie policies and re-activate on switches. The changes also include generic bug fixes. - A simple fix from Vivek, along with a fix for doing proper delayed allocation of the blkcg stats." Fix up annoying conflict just due to different merge resolution in Documentation/feature-removal-schedule.txt * 'for-3.5/core' of git://git.kernel.dk/linux-block: (92 commits) blkcg: tg_stats_alloc_lock is an irq lock vmsplice: relax alignement requirements for SPLICE_F_GIFT blkcg: use radix tree to index blkgs from blkcg blkcg: fix blkcg->css ref leak in __blkg_lookup_create() block: fix elvpriv allocation failure handling block: collapse blk_alloc_request() into get_request() blkcg: collapse blkcg_policy_ops into blkcg_policy blkcg: embed struct blkg_policy_data in policy specific data blkcg: mass rename of blkcg API blkcg: style cleanups for blk-cgroup.h blkcg: remove blkio_group->path[] blkcg: blkg_rwstat_read() was missing inline blkcg: shoot down blkgs if all policies are deactivated blkcg: drop stuff unused after per-queue policy activation update blkcg: implement per-queue policy activation blkcg: add request_queue->root_blkg blkcg: make request_queue bypassing on allocation blkcg: make sure blkg_lookup() returns %NULL if @q is bypassing blkcg: make blkg_conf_prep() take @pol and return with queue lock held blkcg: remove static policy ID enums ...	2012-05-30 08:52:42 -07:00
Linus Torvalds	c7523a7c88	Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer updates from Thomas Gleixner. Various trivial conflict fixups in arch Kconfig due to addition of unrelated entries nearby. And one slightly more subtle one for sparc32 (new user of GENERIC_CLOCKEVENTS), fixed up as per Thomas. * 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (31 commits) timekeeping: Fix a few minor newline issues. time: remove obsolete declaration ntp: Fix a stale comment and a few stray newlines. ntp: Correct TAI offset during leap second timers: Fixup the Kconfig consolidation fallout x86: Use generic time config unicore32: Use generic time config um: Use generic time config tile: Use generic time config sparc: Use: generic time config sh: Use generic time config score: Use generic time config s390: Use generic time config openrisc: Use generic time config powerpc: Use generic time config mn10300: Use generic time config mips: Use generic time config microblaze: Use generic time config m68k: Use generic time config m32r: Use generic time config ...	2012-05-24 13:29:46 -07:00
Linus Torvalds	644473e9c6	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull user namespace enhancements from Eric Biederman: "This is a course correction for the user namespace, so that we can reach an inexpensive, maintainable, and reasonably complete implementation. Highlights: - Config guards make it impossible to enable the user namespace and code that has not been converted to be user namespace safe. - Use of the new kuid_t type ensures the if you somehow get past the config guards the kernel will encounter type errors if you enable user namespaces and attempt to compile in code whose permission checks have not been updated to be user namespace safe. - All uids from child user namespaces are mapped into the initial user namespace before they are processed. Removing the need to add an additional check to see if the user namespace of the compared uids remains the same. - With the user namespaces compiled out the performance is as good or better than it is today. - For most operations absolutely nothing changes performance or operationally with the user namespace enabled. - The worst case performance I could come up with was timing 1 billion cache cold stat operations with the user namespace code enabled. This went from 156s to 164s on my laptop (or 156ns to 164ns per stat operation). - (uid_t)-1 and (gid_t)-1 are reserved as an internal error value. Most uid/gid setting system calls treat these value specially anyway so attempting to use -1 as a uid would likely cause entertaining failures in userspace. - If setuid is called with a uid that can not be mapped setuid fails. I have looked at sendmail, login, ssh and every other program I could think of that would call setuid and they all check for and handle the case where setuid fails. - If stat or a similar system call is called from a context in which we can not map a uid we lie and return overflowuid. The LFS experience suggests not lying and returning an error code might be better, but the historical precedent with uids is different and I can not think of anything that would break by lying about a uid we can't map. - Capabilities are localized to the current user namespace making it safe to give the initial user in a user namespace all capabilities. My git tree covers all of the modifications needed to convert the core kernel and enough changes to make a system bootable to runlevel 1." Fix up trivial conflicts due to nearby independent changes in fs/stat.c * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (46 commits) userns: Silence silly gcc warning. cred: use correct cred accessor with regards to rcu read lock userns: Convert the move_pages, and migrate_pages permission checks to use uid_eq userns: Convert cgroup permission checks to use uid_eq userns: Convert tmpfs to use kuid and kgid where appropriate userns: Convert sysfs to use kgid/kuid where appropriate userns: Convert sysctl permission checks to use kuid and kgids. userns: Convert proc to use kuid/kgid where appropriate userns: Convert ext4 to user kuid/kgid where appropriate userns: Convert ext3 to use kuid/kgid where appropriate userns: Convert ext2 to use kuid/kgid where appropriate. userns: Convert devpts to use kuid/kgid where appropriate userns: Convert binary formats to use kuid/kgid where appropriate userns: Add negative depends on entries to avoid building code that is userns unsafe userns: signal remove unnecessary map_cred_ns userns: Teach inode_capable to understand inodes whose uids map to other namespaces. userns: Fail exec for suid and sgid binaries with ids outside our user namespace. userns: Convert stat to return values mapped from kuids and kgids userns: Convert user specfied uids and gids in chown into kuids and kgid userns: Use uid_eq gid_eq helpers when comparing kuids and kgids in the vfs ...	2012-05-23 17:42:39 -07:00
Linus Torvalds	269af9a1a0	Merge branch 'x86-extable-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull exception table generation updates from Ingo Molnar: "The biggest change here is to allow the build-time sorting of the exception table, to speed up booting. This is achieved by the architecture enabling BUILDTIME_EXTABLE_SORT. This option is enabled for x86 and MIPS currently. On x86 a number of fixes and changes were needed to allow build-time sorting of the exception table, in particular a relocation invariant exception table format was needed. This required the abstracting out of exception table protocol and the removal of 20 years of accumulated assumptions about the x86 exception table format. While at it, this tree also cleans up various other aspects of exception handling, such as early(er) exception handling for rdmsr_safe() et al. All in one, as the result of these changes the x86 exception code is now pretty nice and modern. As an added bonus any regressions in this code will be early and violent crashes, so if you see any of those, you'll know whom to blame!" Fix up trivial conflicts in arch/{mips,x86}/Kconfig files due to nearby modifications of other core architecture options. * 'x86-extable-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (35 commits) Revert "x86, extable: Disable presorted exception table for now" scripts/sortextable: Handle relative entries, and other cleanups x86, extable: Switch to relative exception table entries x86, extable: Disable presorted exception table for now x86, extable: Add _ASM_EXTABLE_EX() macro x86, extable: Remove open-coded exception table entries in arch/x86/ia32/ia32entry.S x86, extable: Remove open-coded exception table entries in arch/x86/include/asm/xsave.h x86, extable: Remove open-coded exception table entries in arch/x86/include/asm/kvm_host.h x86, extable: Remove the now-unused __ASM_EX_SEC macros x86, extable: Remove open-coded exception table entries in arch/x86/xen/xen-asm_32.S x86, extable: Remove open-coded exception table entries in arch/x86/um/checksum_32.S x86, extable: Remove open-coded exception table entries in arch/x86/lib/usercopy_32.c x86, extable: Remove open-coded exception table entries in arch/x86/lib/putuser.S x86, extable: Remove open-coded exception table entries in arch/x86/lib/getuser.S x86, extable: Remove open-coded exception table entries in arch/x86/lib/csum-copy_64.S x86, extable: Remove open-coded exception table entries in arch/x86/lib/copy_user_nocache_64.S x86, extable: Remove open-coded exception table entries in arch/x86/lib/copy_user_64.S x86, extable: Remove open-coded exception table entries in arch/x86/lib/checksum_32.S x86, extable: Remove open-coded exception table entries in arch/x86/kernel/test_rodata.c x86, extable: Remove open-coded exception table entries in arch/x86/kernel/entry_64.S ...	2012-05-23 10:44:35 -07:00
Linus Torvalds	2ff2b289a6	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf changes from Ingo Molnar: "Lots of changes: - (much) improved assembly annotation support in perf report, with jump visualization, searching, navigation, visual output improvements and more. - kernel support for AMD IBS PMU hardware features. Notably 'perf record -e cycles:p' and 'perf top -e cycles:p' should work without skid now, like PEBS does on the Intel side, because it takes advantage of IBS transparently. - the libtracevents library: it is the first step towards unifying tracing tooling and perf, and it also gives a tracing library for external tools like powertop to rely on. - infrastructure: various improvements and refactoring of the UI modules and related code - infrastructure: cleanup and simplification of the profiling targets code (--uid, --pid, --tid, --cpu, --all-cpus, etc.) - tons of robustness fixes all around - various ftrace updates: speedups, cleanups, robustness improvements. - typing 'make' in tools/ will now give you a menu of projects to build and a short help text to explain what each does. - ... and lots of other changes I forgot to list. The perf record make bzImage + perf report regression you reported should be fixed." * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (166 commits) tracing: Remove kernel_lock annotations tracing: Fix initial buffer_size_kb state ring-buffer: Merge separate resize loops perf evsel: Create events initially disabled -- again perf tools: Split term type into value type and term type perf hists: Fix callchain ip printf format perf target: Add uses_mmap field ftrace: Remove selecting FRAME_POINTER with FUNCTION_TRACER ftrace/x86: Have x86 ftrace use the ftrace_modify_all_code() ftrace: Make ftrace_modify_all_code() global for archs to use ftrace: Return record ip addr for ftrace_location() ftrace: Consolidate ftrace_location() and ftrace_text_reserved() ftrace: Speed up search by skipping pages by address ftrace: Remove extra helper functions ftrace: Sort all function addresses, not just per page tracing: change CPU ring buffer state from tracing_cpumask tracing: Check return value of tracing_dentry_percpu() ring-buffer: Reset head page before running self test ring-buffer: Add integrity check at end of iter read ring-buffer: Make addition of pages in ring buffer atomic ...	2012-05-22 18:18:55 -07:00
Linus Torvalds	5d4e2d08e7	Driver core pull for 3.5-rc1 Here's the driver core, and other driver subsystems, pull request for the 3.5-rc1 merge window. Outside of a few minor driver core changes, we ended up with the following different subsystem and core changes as well, due to interdependancies on the driver core: - hyperv driver updates - drivers/memory being created and some drivers moved into it - extcon driver subsystem created out of the old Android staging switch driver code - dynamic debug updates - printk rework, and /dev/kmsg changes All of this has been tested in the linux-next releases for a few weeks with no reported problems. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iEYEABECAAYFAk+7q28ACgkQMUfUDdst+ykXmwCfcPASzC+/bDkuqdWsqzxlWZ7+ VOQAnAriySv397St36J6Hz5bMQZwB1Yq =SQc+ -----END PGP SIGNATURE----- Merge tag 'driver-core-3.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg Kroah-Hartman: "Here's the driver core, and other driver subsystems, pull request for the 3.5-rc1 merge window. Outside of a few minor driver core changes, we ended up with the following different subsystem and core changes as well, due to interdependancies on the driver core: - hyperv driver updates - drivers/memory being created and some drivers moved into it - extcon driver subsystem created out of the old Android staging switch driver code - dynamic debug updates - printk rework, and /dev/kmsg changes All of this has been tested in the linux-next releases for a few weeks with no reported problems. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>" Fix up conflicts in drivers/extcon/extcon-max8997.c where git noticed that a patch to the deleted drivers/misc/max8997-muic.c driver needs to be applied to this one. * tag 'driver-core-3.5-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (90 commits) uio_pdrv_genirq: get irq through platform resource if not set otherwise memory: tegra{20,30}-mc: Remove empty _remove() printk() - isolate KERN_CONT users from ordinary complete lines sysfs: get rid of some lockdep false positives Drivers: hv: util: Properly handle version negotiations. Drivers: hv: Get rid of an unnecessary check in vmbus_prep_negotiate_resp() memory: tegra{20,30}-mc: Use dev_err_ratelimited() driver core: Add dev__ratelimited() family Driver Core: don't oops with unregistered driver in driver_find_device() printk() - restore prefix/timestamp printing for multi-newline strings printk: add stub for prepend_timestamp() ARM: tegra30: Make MC optional in Kconfig ARM: tegra20: Make MC optional in Kconfig ARM: tegra30: MC: Remove unnecessary BUG() ARM: tegra20: MC: Remove unnecessary BUG() printk: correctly align __log_buf ARM: tegra30: Add Tegra Memory Controller(MC) driver ARM: tegra20: Add Tegra Memory Controller(MC) driver printk() - restore timestamp printing at console output printk() - do not merge continuation lines of different threads ...	2012-05-22 16:02:13 -07:00
Linus Torvalds	bf67f3a5c4	Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull smp hotplug cleanups from Thomas Gleixner: "This series is merily a cleanup of code copied around in arch/* and not changing any of the real cpu hotplug horrors yet. I wish I'd had something more substantial for 3.5, but I underestimated the lurking horror..." Fix up trivial conflicts in arch/{arm,sparc,x86}/Kconfig and arch/sparc/include/asm/thread_info_32.h * 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (79 commits) um: Remove leftover declaration of alloc_task_struct_node() task_allocator: Use config switches instead of magic defines sparc: Use common threadinfo allocator score: Use common threadinfo allocator sh-use-common-threadinfo-allocator mn10300: Use common threadinfo allocator powerpc: Use common threadinfo allocator mips: Use common threadinfo allocator hexagon: Use common threadinfo allocator m32r: Use common threadinfo allocator frv: Use common threadinfo allocator cris: Use common threadinfo allocator x86: Use common threadinfo allocator c6x: Use common threadinfo allocator fork: Provide kmemcache based thread_info allocator tile: Use common threadinfo allocator fork: Provide weak arch_release_[task_struct\|thread_info] functions fork: Move thread info gfp flags to header fork: Remove the weak insanity sh: Remove cpu_idle_wait() ...	2012-05-21 19:43:57 -07:00
Linus Torvalds	226da0dbc8	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU changes from Ingo Molnar: "This is the v3.5 RCU tree from Paul E. McKenney: 1) A set of improvements and fixes to the RCU_FAST_NO_HZ feature (with more on the way for 3.6). Posted to LKML: https://lkml.org/lkml/2012/4/23/324 (commits 1-3 and 5), https://lkml.org/lkml/2012/4/16/611 (commit 4), https://lkml.org/lkml/2012/4/30/390 (commit 6), and https://lkml.org/lkml/2012/5/4/410 (commit 7, combined with the other commits for the convenience of the tester). 2) Changes to make rcu_barrier() avoid disrupting execution of CPUs that have no RCU callbacks. Posted to LKML: https://lkml.org/lkml/2012/4/23/322. 3) A couple of commits that improve the efficiency of the interaction between preemptible RCU and the scheduler, these two being all that survived an abortive attempt to allow preemptible RCU's __rcu_read_lock() to be inlined. The full set was posted to LKML at https://lkml.org/lkml/2012/4/14/143, and the first and third patches of that set remain. 4) Lai Jiangshan's algorithmic implementation of SRCU, which includes call_srcu() and srcu_barrier(). A major feature of this new implementation is that synchronize_srcu() no longer disturbs the execution of other CPUs. This work is based on earlier implementations by Peter Zijlstra and Paul E. McKenney. Posted to LKML: https://lkml.org/lkml/2012/2/22/82. 5) A number of miscellaneous bug fixes and improvements which were posted to LKML at: https://lkml.org/lkml/2012/4/23/353 with subsequent updates posted to LKML." * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits) rcu: Make rcu_barrier() less disruptive rcu: Explicitly initialize RCU_FAST_NO_HZ per-CPU variables rcu: Make RCU_FAST_NO_HZ handle timer migration rcu: Update RCU maintainership rcu: Make exit_rcu() more precise and consolidate rcu: Move PREEMPT_RCU preemption to switch_to() invocation rcu: Ensure that RCU_FAST_NO_HZ timers expire on correct CPU rcu: Add rcutorture test for call_srcu() rcu: Implement per-domain single-threaded call_srcu() state machine rcu: Use single value to handle expedited SRCU grace periods rcu: Improve srcu_readers_active_idx()'s cache locality rcu: Remove unused srcu_barrier() rcu: Implement a variant of Peter's SRCU algorithm rcu: Improve SRCU's wait_idx() comments rcu: Flip ->completed only once per SRCU grace period rcu: Increment upper bit only for srcu_read_lock() rcu: Remove fast check path from __synchronize_srcu() rcu: Direct algorithmic SRCU implementation rcu: Introduce rcutorture testing for rcu_barrier() timer: Fix mod_timer_pinned() header comment ...	2012-05-21 19:26:51 -07:00
Thomas Gleixner	764e0da14f	timers: Fixup the Kconfig consolidation fallout Sigh, I missed to check which architecture Kconfig files actually include the core Kconfig file. There are a few which did not. So we broke them. Instead of adding the includes to those, we are better off to move the include to init/Kconfig like we did already with irqs and others. This does not change anything for the architectures using the old style periodic timer mode. It just solves the build wreckage there. For those architectures which use the clock events infrastructure it moves the include of the core Kconfig file to "General setup" which is a way more logical place than having it at random locations specified by the architecture specific Kconfigs. Reported-by: Ingo Molnar <mingo@kernel.org> Cc: Anna-Maria Gleixner <anna-maria@glx-um.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-05-21 23:43:46 +02:00
Linus Torvalds	31a67102f4	Fix blocking allocations called very early during bootup During early boot, when the scheduler hasn't really been fully set up, we really can't do blocking allocations because with certain (dubious) configurations the "might_resched()" calls can actually result in scheduling events. We could just make such users always use GFP_ATOMIC, but quite often the code that does the allocation isn't really aware of the fact that the scheduler isn't up yet, and forcing that kind of random knowledge on the initialization code is just annoying and not good for anybody. And we actually have a the 'gfp_allowed_mask' exactly for this reason: it's just that the kernel init sequence happens to set it to allow blocking allocations much too early. So move the 'gfp_allowed_mask' initialization from 'start_kernel()' (which is some of the earliest init code, and runs with preemption disabled for good reasons) into 'kernel_init()'. kernel_init() is run in the newly created thread that will become the 'init' process, as opposed to the early startup code that runs within the context of what will be the first idle thread. So by the time we reach 'kernel_init()', we know that the scheduler must be at least limping along, because we've already scheduled from the idle thread into the init thread. Reported-by: Steven Rostedt <rostedt@goodmis.org> Cc: David Rientjes <rientjes@google.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-21 12:52:42 -07:00
Arnaldo Carvalho de Melo	16ee6576e2	Merge remote-tracking branch 'tip/perf/urgent' into perf/core Merge reason: We are going to queue up a dependent patch: "perf tools: Move parse event automated tests to separated object" That depends on: commit `e7c72d8` perf tools: Add 'G' and 'H' modifiers to event parsing Conflicts: tools/perf/builtin-stat.c Conflicted with the recent 'perf_target' patches when checking the result of perf_evsel open routines to see if a retry is needed to cope with older kernels where the exclude guest/host perf_event_attr bits were not used. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2012-05-18 13:13:33 -03:00
Eric W. Biederman	b38a86eb19	userns: Convert the move_pages, and migrate_pages permission checks to use uid_eq Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-15 14:59:30 -07:00
Eric W. Biederman	14a590c3f9	userns: Convert cgroup permission checks to use uid_eq Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-15 14:59:30 -07:00
Eric W. Biederman	8751e03958	userns: Convert tmpfs to use kuid and kgid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-15 14:59:29 -07:00
Eric W. Biederman	ab27b91b9f	userns: Convert sysfs to use kgid/kuid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-15 14:59:29 -07:00
Eric W. Biederman	091bd3ea4e	userns: Convert sysctl permission checks to use kuid and kgids. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-15 14:59:28 -07:00
Eric W. Biederman	dcb0f22282	userns: Convert proc to use kuid/kgid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-15 14:59:28 -07:00
Eric W. Biederman	08cefc7ab8	userns: Convert ext4 to user kuid/kgid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-15 14:59:27 -07:00
Eric W. Biederman	1523299d58	userns: Convert ext3 to use kuid/kgid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-15 14:59:27 -07:00
Eric W. Biederman	b8a9f9e183	userns: Convert ext2 to use kuid/kgid where appropriate. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-15 14:59:26 -07:00
Eric W. Biederman	f04c6ce2cf	userns: Convert devpts to use kuid/kgid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-15 14:59:26 -07:00
Eric W. Biederman	ebc887b278	userns: Convert binary formats to use kuid/kgid where appropriate Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-15 14:59:25 -07:00
Eric W. Biederman	e1c972b681	userns: Add negative depends on entries to avoid building code that is userns unsafe Add a new internal Kconfig option UIDGID_CONVERTED that is true when the selected Kconfig options have been converted to be user namespace safe, and guard USER_NS and guard the UIDGID_STRICT_TYPE_CHECK options with it. This keeps innocent kernel users from having the choice to enable the user namespace in the cases where it is known not to work. Most of the rest of the conversions are simple and straight forward but their sheer number means it is good not to count on having them all done and reviwed before thinking of merging this code. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-05-15 14:59:25 -07:00
Ingo Molnar	2d84e023cb	Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu Pull the v3.5 RCU tree from Paul E. McKenney: 1) A set of improvements and fixes to the RCU_FAST_NO_HZ feature (with more on the way for 3.6). Posted to LKML: https://lkml.org/lkml/2012/4/23/324 (commits 1-3 and 5), https://lkml.org/lkml/2012/4/16/611 (commit 4), https://lkml.org/lkml/2012/4/30/390 (commit 6), and https://lkml.org/lkml/2012/5/4/410 (commit 7, combined with the other commits for the convenience of the tester). 2) Changes to make rcu_barrier() avoid disrupting execution of CPUs that have no RCU callbacks. Posted to LKML: https://lkml.org/lkml/2012/4/23/322. 3) A couple of commits that improve the efficiency of the interaction between preemptible RCU and the scheduler, these two being all that survived an abortive attempt to allow preemptible RCU's __rcu_read_lock() to be inlined. The full set was posted to LKML at https://lkml.org/lkml/2012/4/14/143, and the first and third patches of that set remain. 4) Lai Jiangshan's algorithmic implementation of SRCU, which includes call_srcu() and srcu_barrier(). A major feature of this new implementation is that synchronize_srcu() no longer disturbs the execution of other CPUs. This work is based on earlier implementations by Peter Zijlstra and Paul E. McKenney. Posted to LKML: https://lkml.org/lkml/2012/2/22/82. 5) A number of miscellaneous bug fixes and improvements which were posted to LKML at: https://lkml.org/lkml/2012/4/23/353 with subsequent updates posted to LKML. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-05-14 08:41:46 +02:00
Thomas Gleixner	67ba5293f7	Merge branch 'smp/threadalloc' into smp/hotplug Reason: Pull in the separate branch which was created so arch/tile can base further work on it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2012-05-08 14:07:48 +02:00
Sasha Levin	377485f624	init: don't try mounting device as nfs root unless type fully matches Currently, we'll try mounting any device who's major device number is UNNAMED_MAJOR as NFS root. This would happen for non-NFS devices as well (such as 9p devices) but it wouldn't cause any issues since mounting the device as NFS would fail quickly and the code proceeded to doing the proper mount: [ 101.522716] VFS: Unable to mount root fs via NFS, trying floppy. [ 101.534499] VFS: Mounted root (9p filesystem) on device 0:18. Commit 6829a048102a ("NFS: Retry mounting NFSROOT") introduced retries when mounting NFS root, which means that now we don't immediately fail and instead it takes an additional 90+ seconds until we stop retrying, which has revealed the issue this patch fixes. This meant that it would take an additional 90 seconds to boot when we're not using a device type which gets detected in order before NFS. This patch modifies the NFS type check to require device type to be 'Root_NFS' instead of requiring the device to have an UNNAMED_MAJOR major. This makes boot process cleaner since we now won't go through the NFS mounting code at all when the device isn't an NFS root ("/dev/nfs"). Signed-off-by: Sasha Levin <levinsasha928@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-05-05 10:04:40 -07:00
Thomas Gleixner	a6359d1eec	init_task: Replace CONFIG_HAVE_GENERIC_INIT_TASK Now that all archs except ia64 are converted, replace the config and let the ia64 select CONFIG_ARCH_INIT_TASK Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20120503085035.867948914@linutronix.de	2012-05-05 13:00:46 +02:00
Thomas Gleixner	a4a2eb490e	init_task: Create generic init_task instance All archs define init_task in the same way (except ia64, but there is no particular reason why ia64 cannot use the common version). Create a generic instance so all archs can be converted over. The config switch is temporary and will be removed when all archs are converted over. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Chen Liqin <liqin.chen@sunplusct.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Chris Zankel <chris@zankel.net> Cc: David Howells <dhowells@redhat.com> Cc: David S. Miller <davem@davemloft.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guan Xuetao <gxt@mprc.pku.edu.cn> Cc: Haavard Skinnemoen <hskinnemoen@gmail.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: James E.J. Bottomley <jejb@parisc-linux.org> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Mark Salter <msalter@redhat.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Kuo <rkuo@codeaurora.org> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@arm.linux.org.uk> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Link: http://lkml.kernel.org/r/20120503085034.092585287@linutronix.de	2012-05-05 13:00:21 +02:00
Greg Kroah-Hartman	eb1574270a	Merge 3.4-rc5 into driver-core-next This was done to resolve a merge issue with the init/main.c file. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-05-02 14:33:37 -07:00
Jens Axboe	0b7877d4ee	Linux 3.4-rc5 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (GNU/Linux) iQEcBAABAgAGBQJPnb50AAoJEHm+PkMAQRiGAE0H/A4zFZIUGmF3miKPDYmejmrZ oVDYxVAu6JHjHWhu8E3VsinvyVscowjV8dr15eSaQzmDmRkSHAnUQ+dB7Di7jLC2 MNopxsWjwyZ8zvvr3rFR76kjbWKk/1GYytnf7GPZLbJQzd51om2V/TY/6qkwiDSX U8Tt7ihSgHAezefqEmWp2X/1pxDCEt+VFyn9vWpkhgdfM1iuzF39MbxSZAgqDQ/9 JJrBHFXhArqJguhENwL7OdDzkYqkdzlGtS0xgeY7qio2CzSXxZXK4svT6FFGA8Za xlAaIvzslDniv3vR2ZKd6wzUwFHuynX222hNim3QMaYdXm012M+Nn1ufKYGFxI0= =4d4w -----END PGP SIGNATURE----- Merge tag 'v3.4-rc5' into for-3.5/core The core branch is behind driver commits that we want to build on for 3.5, hence I'm pulling in a later -rc. Linux 3.4-rc5 Conflicts: Documentation/feature-removal-schedule.txt Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-05-01 14:29:55 +02:00
Jim Cromie	9fb48c744b	params: add 3rd arg to option handler callback signature Add a 3rd arg, named "doing", to unknown-options callbacks invoked from parse_args(). The arg is passed as: "Booting kernel" from start_kernel(), initcall_level_names[i] from do_initcall_level(), mod->name from load_module(), via parse_args(), parse_one() parse_args() already has the "name" parameter, which is renamed to "doing" to better reflect current uses 1,2 above. parse_args() passes it to an altered parse_one(), which now passes it down into the unknown option handler callbacks. The mod->name will be needed to handle dyndbg for loadable modules, since params passed by modprobe are not qualified (they do not have a "$modname." prefix), and by the time the unknown-param callback is called, the module name is not otherwise available. Minor tweaks: Add param-name to parse_one's pr_debug(), current message doesnt identify the param being handled, add it. Add a pr_info to print current level and level_name of the initcall, and number of registered initcalls at that level. This adds 7 lines to dmesg output, like: initlevel:6=device, 172 registered initcalls Drop "parameters" from initcall_level_names[], its unhelpful in the pr_info() added above. This array is passed into parse_args() by do_initcall_level(). CC: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Acked-by: Jason Baron <jbaron@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2012-04-30 14:05:27 -04:00
Robert Richter	392d65a9ad	perf: Remove PERF_COUNTERS config option Renaming remaining PERF_COUNTERS options into PERF_EVENTS. Think we can get rid of PERF_COUNTERS now. Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1333643084-26776-5-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2012-04-26 13:52:52 +02:00
Chris Metcalf	a99cd11251	init: fix bug where environment vars can't be passed via boot args Commit `026cee0086` had the side-effect of dropping the '=' from the unknown boot arguments that are passed to init as environment variables. This is because parse_args() puts a NUL in the string where the '=' was when it passes the "param" and "val" pointers to the parsing subfunctions. Previously, unknown_bootoption() was the last parse_args() subfunction to run, and it carefully put back the '=' character. Now the ignore_unknown_bootoption() is the last one to run, and it wasn't doing the necessary repair, so the envp params ended up with the embedded NUL and were no longer seen as valid environment variables by init. Tested-by: Woody Suwalski <terraluna977@gmail.com> Acked-by: Pawel Moll <pawel.moll@arm.com> Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2012-04-25 11:47:29 -04:00
Paul E. McKenney	8932a63d5e	rcu: Reduce cache-miss initialization latencies for large systems Commit #0209f649 (rcu: limit rcu_node leaf-level fanout) set an upper limit of 16 on the leaf-level fanout for the rcu_node tree. This was needed to reduce lock contention that was induced by the synchronization of scheduling-clock interrupts, which was in turn needed to improve energy efficiency for moderate-sized lightly loaded servers. However, reducing the leaf-level fanout means that there are more leaf-level rcu_node structures in the tree, which in turn means that RCU's grace-period initialization incurs more cache misses. This is not a problem on moderate-sized servers with only a few tens of CPUs, but becomes a major source of real-time latency spikes on systems with many hundreds of CPUs. In addition, the workloads running on these large systems tend to be CPU-bound, which eliminates the energy-efficiency advantages of synchronizing scheduling-clock interrupts. Therefore, these systems need maximal values for the rcu_node leaf-level fanout. This commit addresses this problem by introducing a new kernel parameter named RCU_FANOUT_LEAF that directly controls the leaf-level fanout. This parameter defaults to 16 to handle the common case of a moderate sized lightly loaded servers, but may be set higher on larger systems. Reported-by: Mike Galbraith <efault@gmx.de> Reported-by: Dimitri Sivanich <sivanich@sgi.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-24 20:54:52 -07:00
Paul E. McKenney	c9336643e1	rcu: Clarify help text for RCU_BOOST_PRIO The old text confused real-time applications with real-time threads, so that you pretty much needed to understand how this kernel configuration parameter worked to understand the help text. This commit therefore attempts to make the help text human-readable. Reported-by: Jörn Engel <joern@purestorage.com> Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-04-24 20:54:50 -07:00
David Daney	1dbdc6f177	kbuild/extable: Hook up sortextable into the build system. Define a config variable BUILDTIME_EXTABLE_SORT to control build time sorting of the kernel's exception table. Patch Makefile to do the sorting when BUILDTIME_EXTABLE_SORT is selected. Signed-off-by: David Daney <david.daney@cavium.com> Link: http://lkml.kernel.org/r/1334872799-14589-4-git-send-email-ddaney.cavm@gmail.com Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>	2012-04-19 15:06:56 -07:00
Eric W. Biederman	5673a94c14	userns: Add a Kconfig option to enforce strict kuid and kgid type checks Make it possible to easily switch between strong mandatory type checks and relaxed type checks so that the code can easily be tested with the type checks and then built with the strong type checks disabled so the resulting code can be used. Require strong mandatory type checks when enabling the user namespace. It is very simple to make a typo and use the wrong type allowing conversions to/from userspace values to be bypassed by accident, the strong type checks prevent this. Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>	2012-04-07 17:11:01 -07:00
Linus Torvalds	deb74f5ca1	Autogenerated GPG tag for Rusty D1ADB8F1: 15EE 8D6C AB0E 7F0C F999 BFCB D920 0E6C D1AD B8F1 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJPc+5PAAoJENkgDmzRrbjx8qwQAIRGDWGAJ7fiu8QBVbjycXJG 7828enxrbBQodNmc+uAkYvTv3KEoi8tlweMsk/lWDv8WovZV4IlQDEFCX/f4hWVY S+2PmqJkN/alsG3dXd00zotK9mOJD+mQPAdjUBaNnRdp3QoV3YrjgihkWiL23DyT dZTgqXdbUJkHk/d9YD1qcDvWdSr1EufSLYa52PhLJqYiYVk8zCdX82deJX1MWh64 v9I6htA73ORoX4JBGsFAOHO8fmLaq1yhBUMHOL4+gfEJVv4kSTU05GgepBHQP1fm BbG2hN6G4vqqiqhV5A59+h271o/2d/KBGKx8/twRGk8tNJIwTIVnr/qcGuUfytC3 vA1fmq3vul0bzbqRgph8bGJyoVIg8CHjq24BFJQOXiQ1/6HOvjxnKBYs+3sVA829 ZYQYuEoRKmTsD3vv3nmcqAdZZDzehBQ499bEqDNsnQRLOjOVNag/pJSaENkeVC4T CKYXt9BEabYnermPLdrjiabPE27GaEznX11SzCSXiWJsKX2kJnvz5RxVo8nlh1fc /KQWJyWi/QVmAdy4eCJFp48513BqncHvKtPZ6zN9+Y6NHKmnmAqieZhh4yV/SCqi EcK2oHQXmioKldn5DANQjeUCWlmEYXHbR08ahGRLNc7GZ1qKCgDr8+WEC0XYB/gQ XLH3KKLM+VmvtonqjDV7 =W59/ -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://github.com/rustyrussell/linux Pull cpumask cleanups from Rusty Russell: "(Somehow forgot to send this out; it's been sitting in linux-next, and if you don't want it, it can sit there another cycle)" I'm a sucker for things that actually delete lines of code. Fix up trivial conflict in arch/arm/kernel/kprobes.c, where Rusty fixed a user of &cpu_online_map to be cpu_online_mask, but that code got deleted by commit `b21d55e98a` ("ARM: 7332/1: extract out code patch function from kprobes"). * tag 'for-linus' of git://github.com/rustyrussell/linux: cpumask: remove old cpu__map. documentation: remove references to cpu__map. drivers/cpufreq/db8500-cpufreq: remove references to cpu__map. remove references to cpu__map in arch/	2012-04-02 08:53:24 -07:00
Tejun Heo	959d851caa	Merge branch 'for-3.5' of ../cgroup into block/for-3.5/core-merged cgroup/for-3.5 contains the following changes which blk-cgroup needs to proceed with the on-going cleanup. * Dynamic addition and removal of cftypes to make config/stat file handling modular for policies. * cgroup removal update to not wait for css references to drain to fix blkcg removal hang caused by cfq caching cfqgs. Pull in cgroup/for-3.5 into block/for-3.5/core. This causes the following conflicts in block/blk-cgroup.c. * `761b3ef50e` "cgroup: remove cgroup_subsys argument from callbacks" conflicts with blkiocg_pre_destroy() addition and blkiocg_attach() removal. Resolved by removing @subsys from all subsys methods. * `676f7c8f84` "cgroup: relocate cftype and cgroup_subsys definitions in controllers" conflicts with ->pre_destroy() and ->attach() updates and removal of modular config. Resolved by dropping forward declarations of the methods and applying updates to the relocated blkio_subsys. * `4baf6e3325` "cgroup: convert all non-memcg controllers to the new cftype interface" builds upon the previous item. Resolved by adding ->base_cftypes to the relocated blkio_subsys. Signed-off-by: Tejun Heo <tj@kernel.org>	2012-04-01 12:55:00 -07:00
Al Viro	39429c5e4a	new helper: ext2_image_size() ... implemented that way since the next commit will leave it almost alone in ext2_fs.h - most of the file (including struct ext2_super_block) is going to move to fs/ext2/ext2.h. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-03-31 16:03:16 -04:00
Al Viro	2f99c36986	get rid of pointless includes of ext2_fs.h Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-03-31 16:03:15 -04:00
Rusty Russell	5f054e31c6	documentation: remove references to cpu_*_map. This has been obsolescent for a while, fix documentation and misc comments. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-03-29 15:38:31 +10:30
Linus Torvalds	0195c00244	Disintegrate and delete asm/system.h -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (GNU/Linux) iQIVAwUAT3NKzROxKuMESys7AQKElw/+JyDxJSlj+g+nymkx8IVVuU8CsEwNLgRk 8KEnRfLhGtkXFLSJYWO6jzGo16F8Uqli1PdMFte/wagSv0285/HZaKlkkBVHdJ/m u40oSjgT013bBh6MQ0Oaf8pFezFUiQB5zPOA9QGaLVGDLXCmgqUgd7exaD5wRIwB ZmyItjZeAVnDfk1R+ZiNYytHAi8A5wSB+eFDCIQYgyulA1Igd1UnRtx+dRKbvc/m rWQ6KWbZHIdvP1ksd8wHHkrlUD2pEeJ8glJLsZUhMm/5oMf/8RmOCvmo8rvE/qwl eDQ1h4cGYlfjobxXZMHqAN9m7Jg2bI946HZjdb7/7oCeO6VW3FwPZ/Ic75p+wp45 HXJTItufERYk6QxShiOKvA+QexnYwY0IT5oRP4DrhdVB/X9cl2MoaZHC+RbYLQy+ /5VNZKi38iK4F9AbFamS7kd0i5QszA/ZzEzKZ6VMuOp3W/fagpn4ZJT1LIA3m4A9 Q0cj24mqeyCfjysu0TMbPtaN+Yjeu1o1OFRvM8XffbZsp5bNzuTDEvviJ2NXw4vK 4qUHulhYSEWcu9YgAZXvEWDEM78FXCkg2v/CrZXH5tyc95kUkMPcgG+QZBB5wElR FaOKpiC/BuNIGEf02IZQ4nfDxE90QwnDeoYeV+FvNj9UEOopJ5z5bMPoTHxm4cCD NypQthI85pc= =G9mT -----END PGP SIGNATURE----- Merge tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system Pull "Disintegrate and delete asm/system.h" from David Howells: "Here are a bunch of patches to disintegrate asm/system.h into a set of separate bits to relieve the problem of circular inclusion dependencies. I've built all the working defconfigs from all the arches that I can and made sure that they don't break. The reason for these patches is that I recently encountered a circular dependency problem that came about when I produced some patches to optimise get_order() by rewriting it to use ilog2(). This uses bitops - and on the SH arch asm/bitops.h drags in asm-generic/get_order.h by a circuituous route involving asm/system.h. The main difficulty seems to be asm/system.h. It holds a number of low level bits with no/few dependencies that are commonly used (eg. memory barriers) and a number of bits with more dependencies that aren't used in many places (eg. switch_to()). These patches break asm/system.h up into the following core pieces: (1) asm/barrier.h Move memory barriers here. This already done for MIPS and Alpha. (2) asm/switch_to.h Move switch_to() and related stuff here. (3) asm/exec.h Move arch_align_stack() here. Other process execution related bits could perhaps go here from asm/processor.h. (4) asm/cmpxchg.h Move xchg() and cmpxchg() here as they're full word atomic ops and frequently used by atomic_xchg() and atomic_cmpxchg(). (5) asm/bug.h Move die() and related bits. (6) asm/auxvec.h Move AT_VECTOR_SIZE_ARCH here. Other arch headers are created as needed on a per-arch basis." Fixed up some conflicts from other header file cleanups and moving code around that has happened in the meantime, so David's testing is somewhat weakened by that. We'll find out anything that got broken and fix it.. * tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system: (38 commits) Delete all instances of asm/system.h Remove all #inclusions of asm/system.h Add #includes needed to permit the removal of asm/system.h Move all declarations of free_initmem() to linux/mm.h Disintegrate asm/system.h for OpenRISC Split arch_align_stack() out from asm-generic/system.h Split the switch_to() wrapper out of asm-generic/system.h Move the asm-generic/system.h xchg() implementation to asm-generic/cmpxchg.h Create asm-generic/barrier.h Make asm-generic/cmpxchg.h #include asm-generic/cmpxchg-local.h Disintegrate asm/system.h for Xtensa Disintegrate asm/system.h for Unicore32 [based on ver #3, changed by gxt] Disintegrate asm/system.h for Tile Disintegrate asm/system.h for Sparc Disintegrate asm/system.h for SH Disintegrate asm/system.h for Score Disintegrate asm/system.h for S390 Disintegrate asm/system.h for PowerPC Disintegrate asm/system.h for PA-RISC Disintegrate asm/system.h for MN10300 ...	2012-03-28 15:58:21 -07:00
David Howells	49a7f04a4b	Move all declarations of free_initmem() to linux/mm.h Move all declarations of free_initmem() to linux/mm.h so that there's only one and it's used by everything. Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-c6x-dev@linux-c6x.org cc: microblaze-uclinux@itee.uq.edu.au cc: linux-sh@vger.kernel.org cc: sparclinux@vger.kernel.org cc: x86@kernel.org cc: linux-mm@kvack.org	2012-03-28 18:30:03 +01:00
Pawel Moll	026cee0086	params: <level>_initcall-like kernel parameters This patch adds a set of macros that can be used to declare kernel parameters to be parsed _before_ initcalls at a chosen level are executed. We rename the now-unused "flags" field of struct kernel_param as the level. It's signed, for when we use this for early params as well, in future. Linker macro collating init calls had to be modified in order to add additional symbols between levels that are later used by the init code to split the calls into blocks. Signed-off-by: Pawel Moll <pawel.moll@arm.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-03-26 12:50:51 +10:30
Bernhard Walle	0e0cb892a8	init/do_mounts.c: print error code on mount failure Printing the error code makes it easier to debug the cause of a mount failure. For example I had the problem that the root file system could not be mounted read-writeable because my SD card was write-protected. Without an error code it looks like the SD card was not detected at all. Signed-off-by: Bernhard Walle <bernhard@bwalle.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-23 16:58:38 -07:00
Diwakar Tundlam	8595c539f0	init: check printed flag to skip printing message Otherwise the 'Calibration skipped' message gets printed everytime a CPU is hotplugged in, cluttering console for systems that frequently hotplug CPUs. Signed-off-by: Diwakar Tundlam <dtundlam@nvidia.com> Cc: Phil Carmody <ext-phil.2.carmody@nokia.com> Cc: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Greg KH <greg@kroah.com> Cc: Sameer Nanda <snanda@chromium.org> Cc: Peter De Schrijver <pdeschrijver@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-03-23 16:58:38 -07:00
Linus Torvalds	5375871d43	Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull powerpc merge from Benjamin Herrenschmidt: "Here's the powerpc batch for this merge window. It is going to be a bit more nasty than usual as in touching things outside of arch/powerpc mostly due to the big iSeriesectomy :-) We finally got rid of the bugger (legacy iSeries support) which was a PITA to maintain and that nobody really used anymore. Here are some of the highlights: - Legacy iSeries is gone. Thanks Stephen ! There's still some bits and pieces remaining if you do a grep -ir series arch/powerpc but they are harmless and will be removed in the next few weeks hopefully. - The 'fadump' functionality (Firmware Assisted Dump) replaces the previous (equivalent) "pHyp assisted dump"... it's a rewrite of a mechanism to get the hypervisor to do crash dumps on pSeries, the new implementation hopefully being much more reliable. Thanks Mahesh Salgaonkar. - The "EEH" code (pSeries PCI error handling & recovery) got a big spring cleaning, motivated by the need to be able to implement a new backend for it on top of some new different type of firwmare. The work isn't complete yet, but a good chunk of the cleanups is there. Note that this adds a field to struct device_node which is not very nice and which Grant objects to. I will have a patch soon that moves that to a powerpc private data structure (hopefully before rc1) and we'll improve things further later on (hopefully getting rid of the need for that pointer completely). Thanks Gavin Shan. - I dug into our exception & interrupt handling code to improve the way we do lazy interrupt handling (and make it work properly with "edge" triggered interrupt sources), and while at it found & fixed a wagon of issues in those areas, including adding support for page fault retry & fatal signals on page faults. - Your usual random batch of small fixes & updates, including a bunch of new embedded boards, both Freescale and APM based ones, etc..." I fixed up some conflicts with the generalized irq-domain changes from Grant Likely, hopefully correctly. * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (141 commits) powerpc/ps3: Do not adjust the wrapper load address powerpc: Remove the rest of the legacy iSeries include files powerpc: Remove the remaining CONFIG_PPC_ISERIES pieces init: Remove CONFIG_PPC_ISERIES powerpc: Remove FW_FEATURE ISERIES from arch code tty/hvc_vio: FW_FEATURE_ISERIES is no longer selectable powerpc/spufs: Fix double unlocks powerpc/5200: convert mpc5200 to use of_platform_populate() powerpc/mpc5200: add options to mpc5200_defconfig powerpc/mpc52xx: add a4m072 board support powerpc/mpc5200: update mpc5200_defconfig to fit for charon board Documentation/powerpc/mpc52xx.txt: Checkpatch cleanup powerpc/44x: Add additional device support for APM821xx SoC and Bluestone board powerpc/44x: Add support PCI-E for APM821xx SoC and Bluestone board MAINTAINERS: Update PowerPC 4xx tree powerpc/44x: The bug fixed support for APM821xx SoC and Bluestone board powerpc: document the FSL MPIC message register binding powerpc: add support for MPIC message register API powerpc/fsl: Added aliased MSIIR register address to MSI node in dts powerpc/85xx: mpc8548cds - add 36-bit dts ...	2012-03-21 18:55:10 -07:00
Linus Torvalds	69a7aebcf0	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial Pull trivial tree from Jiri Kosina: "It's indeed trivial -- mostly documentation updates and a bunch of typo fixes from Masanari. There are also several linux/version.h include removals from Jesper." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (101 commits) kcore: fix spelling in read_kcore() comment constify struct pci_dev * in obvious cases Revert "char: Fix typo in viotape.c" init: fix wording error in mm_init comment usb: gadget: Kconfig: fix typo for 'different' Revert "power, max8998: Include linux/module.h just once in drivers/power/max8998_charger.c" writeback: fix fn name in writeback_inodes_sb_nr_if_idle() comment header writeback: fix typo in the writeback_control comment Documentation: Fix multiple typo in Documentation tpm_tis: fix tis_lock with respect to RCU Revert "media: Fix typo in mixer_drv.c and hdmi_drv.c" Doc: Update numastat.txt qla4xxx: Add missing spaces to error messages compiler.h: Fix typo security: struct security_operations kerneldoc fix Documentation: broken URL in libata.tmpl Documentation: broken URL in filesystems.tmpl mtd: simplify return logic in do_map_probe() mm: fix comment typo of truncate_inode_pages_range power: bq27x00: Fix typos in comment ...	2012-03-20 21:12:50 -07:00
Stephen Rothwell	bc58450b02	init: Remove CONFIG_PPC_ISERIES It is no longer selectable, so remove the check for it. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-03-21 11:16:12 +11:00
Linus Torvalds	2ba68940c8	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler changes for v3.4 from Ingo Molnar * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits) printk: Make it compile with !CONFIG_PRINTK sched/x86: Fix overflow in cyc2ns_offset sched: Fix nohz load accounting -- again! sched: Update yield() docs printk/sched: Introduce special printk_sched() for those awkward moments sched/nohz: Correctly initialize 'next_balance' in 'nohz' idle balancer sched: Cleanup cpu_active madness sched: Fix load-balance wreckage sched: Clean up parameter passing of proc_sched_autogroup_set_nice() sched: Ditch per cgroup task lists for load-balancing sched: Rename load-balancing fields sched: Move load-balancing arguments into helper struct sched/rt: Do not submit new work when PI-blocked sched/rt: Prevent idle task boosting sched/wait: Add __wake_up_all_locked() API sched/rt: Document scheduler related skip-resched-check sites sched/rt: Use schedule_preempt_disabled() sched/rt: Add schedule_preempt_disabled() sched/rt: Do not throttle when PI boosting sched/rt: Keep period timer ticking when rt throttling is active ...	2012-03-20 10:31:44 -07:00
Jim Cromie	7fa87ce726	init: fix wording error in mm_init comment s/countinuous/contiguous/, reword sentence. Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2012-03-15 15:54:58 +01:00
Tejun Heo	32e380aedc	blkcg: make CONFIG_BLK_CGROUP bool Block cgroup core can be built as module; however, it isn't too useful as blk-throttle can only be built-in and cfq-iosched is usually the default built-in scheduler. Scheduled blkcg cleanup requires calling into blkcg from block core. To simplify that, disallow building blkcg as module by making CONFIG_BLK_CGROUP bool. If building blkcg core as module really matters, which I doubt, we can revisit it after blkcg API cleanup. -v2: Vivek pointed out that IOSCHED_CFQ was incorrectly updated to depend on BLK_CGROUP. Fixed. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2012-03-06 21:27:21 +01:00
Thomas Gleixner	bd2f55361f	sched/rt: Use schedule_preempt_disabled() Coccinelle based conversion. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-24swm5zut3h9c4a6s46x8rws@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-03-01 10:28:03 +01:00
Paul E. McKenney	5c8806a037	rcu: Move RCU_TRACE to lib/Kconfig.debug The RCU_TRACE kernel parameter has always been intended for debugging, not for production use. Formalize this by moving RCU_TRACE from init/Kconfig to lib/Kconfig.debug. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2012-02-21 09:03:26 -08:00
Linus Torvalds	f429ee3b80	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/audit: (29 commits) audit: no leading space in audit_log_d_path prefix audit: treat s_id as an untrusted string audit: fix signedness bug in audit_log_execve_info() audit: comparison on interprocess fields audit: implement all object interfield comparisons audit: allow interfield comparison between gid and ogid audit: complex interfield comparison helper audit: allow interfield comparison in audit rules Kernel: Audit Support For The ARM Platform audit: do not call audit_getname on error audit: only allow tasks to set their loginuid if it is -1 audit: remove task argument to audit_set_loginuid audit: allow audit matching on inode gid audit: allow matching on obj_uid audit: remove audit_finish_fork as it can't be called audit: reject entry,always rules audit: inline audit_free to simplify the look of generic code audit: drop audit_set_macxattr as it doesn't do anything audit: inline checks for not needing to collect aux records audit: drop some potentially inadvisable likely notations ... Use evil merge to fix up grammar mistakes in Kconfig file. Bad speling and horrible grammar (and copious swearing) is to be expected, but let's keep it to commit messages and comments, rather than expose it to users in config help texts or printouts.	2012-01-17 16:41:31 -08:00
Nathaniel Husted	29ef73b7a8	Kernel: Audit Support For The ARM Platform This patch provides functionality to audit system call events on the ARM platform. The implementation was based off the structure of the MIPS platform and information in this (http://lists.fedoraproject.org/pipermail/arm/2009-October/000382.html) mailing list thread. The required audit_syscall_exit and audit_syscall_entry checks were added to ptrace using the standard registers for system call values (r0 through r3). A thread information flag was added for auditing (TIF_SYSCALL_AUDIT) and a meta-flag was added (_TIF_SYSCALL_WORK) to simplify modifications to the syscall entry/exit. Now, if either the TRACE flag is set or the AUDIT flag is set, the syscall_trace function will be executed. The prober changes were made to Kconfig to allow CONFIG_AUDITSYSCALL to be enabled. Due to platform availability limitations, this patch was only tested on the Android platform running the modified "android-goldfish-2.6.29" kernel. A test compile was performed using Code Sourcery's cross-compilation toolset and the current linux-3.0 stable kernel. The changes compile without error. I'm hoping, due to the simple modifications, the patch is "obviously correct". Signed-off-by: Nathaniel Husted <nhusted@gmail.com> Signed-off-by: Eric Paris <eparis@redhat.com>	2012-01-17 16:17:01 -05:00
Eric Paris	633b454545	audit: only allow tasks to set their loginuid if it is -1 At the moment we allow tasks to set their loginuid if they have CAP_AUDIT_CONTROL. In reality we want tasks to set the loginuid when they log in and it be impossible to ever reset. We had to make it mutable even after it was once set (with the CAP) because on update and admin might have to restart sshd. Now sshd would get his loginuid and the next user which logged in using ssh would not be able to set his loginuid. Systemd has changed how userspace works and allowed us to make the kernel work the way it should. With systemd users (even admins) are not supposed to restart services directly. The system will restart the service for them. Thus since systemd is going to loginuid==-1, sshd would get -1, and sshd would be allowed to set a new loginuid without special permissions. If an admin in this system were to manually start an sshd he is inserting himself into the system chain of trust and thus, logically, it's his loginuid that should be used! Since we have old systems I make this a Kconfig option. Signed-off-by: Eric Paris <eparis@redhat.com>	2012-01-17 16:17:00 -05:00
Linus Torvalds	0a80939b3e	Autogenerated GPG tag for Rusty D1ADB8F1: 15EE 8D6C AB0E 7F0C F999 BFCB D920 0E6C D1AD B8F1 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQIcBAABAgAGBQJPD2aFAAoJENkgDmzRrbjxNzsQAIeYbbrXYLjr6kQzUSngj/eC FzjaTEfYTQIeuQCFJHcHthyc5lXV4sQbo3jOezW+Bp5yuDJL2aWIHesSfWZe7imu zQdM4VshOYdAmUR9Q0AW5zhB8Smbs7/AyABiF2jm4p0ZPOuyMDSlei9sjvE9Vjvt B7g5ht7L6kz0JbDnwwy0u5gs+tEitwpXYId9Y4ysZIBzIbL0qkPX8veOddGTMy0N 8xhWXaKtufpjvxFD2ORLDsw3AkoF1xXSNuFd/5nzCNpbeE7TW931jfkPoqJumuAO 7GLxcU9kKYl+IICobC6wBtsj/RrB7w+cBXMvPGwdBliam1qaRhUcJZi5FLM/Ha5d 2A9QDYNUpoXiO8JbPXrV9Z+Y0+Co8RilsQj7R/rjZh6AbbYCWt9nxzx2Svl/RfTr xfiimHuB2P3rHjOvpCXULwOOuE5c8MzPuWncpdjiD3uGXOY/aY+X1m+if/quJw9D grPlKL0+YiRakEYUeGG4M77KCqyKFZaF7L7UQPbqfZcj8V/9AW3/7U5I/B9RlAjs idsr4fcf5s0N+oKUyTCW1ncpUDQNiwbU2NyJQqeu1ZxaRGj72AgyvsaNeyIPDyK+ f6x95Bi7i8KLjXc9Z1KvJwh2Nxt25gNUiTYVha/9H2NpJGd1cfI15kTOGXrgddVv 1pvuGcJDZwYiwfiXr3FL =HHrh -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://github.com/rustyrussell/linux Autogenerated GPG tag for Rusty D1ADB8F1: 15EE 8D6C AB0E 7F0C F999 BFCB D920 0E6C D1AD B8F1 * tag 'for-linus' of git://github.com/rustyrussell/linux: module_param: check that bool parameters really are bool. intelfbdrv.c: bailearly is an int module_param paride/pcd: fix bool verbose module parameter. module_param: make bool parameters really bool (drivers & misc) module_param: make bool parameters really bool (arch) module_param: make bool parameters really bool (core code) kernel/async: remove redundant declaration. printk: fix unnecessary module_param_name. lirc_parallel: fix module parameter description. module_param: avoid bool abuse, add bint for special cases. module_param: check type correctness for module_param_array modpost: use linker section to generate table. modpost: use a table rather than a giant if/else statement. modules: sysfs - export: taint, coresize, initsize kernel/params: replace DEBUGP with pr_debug module: replace DEBUGP with pr_debug module: struct module_ref should contains long fields module: Fix performance regression on modules with large symbol tables module: Add comments describing how the "strmap" logic works Fix up conflicts in scripts/mod/file2alias.c due to the new linker- generated table approach to adding __mod_*_device_table entries. The ARM sa11x0 mcp bus needed to be converted to that too.	2012-01-14 12:32:16 -08:00
Cyrill Gorcunov	067bce1a06	c/r: introduce CHECKPOINT_RESTORE symbol For checkpoint/restore we need auxilary features being compiled into the kernel, such as additional prctl codes, /proc/<pid>/map_files and etc... but same time these features are not mandatory for a regular kernel so CHECKPOINT_RESTORE config symbol should bring a way to disable them all at once if one wish to get rid of additional functionality. Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org> Cc: Tejun Heo <tj@kernel.org> Cc: Andrew Vagin <avagin@openvz.org> Cc: Serge Hallyn <serge.hallyn@canonical.com> Cc: Vasiliy Kulikov <segoon@openwall.com> Reviewed-by: Kees Cook <keescook@chromium.org> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2012-01-12 20:13:12 -08:00
Rusty Russell	2329abfa34	module_param: make bool parameters really bool (core code) module_param(bool) used to counter-intuitively take an int. In `fddd5201` (mid-2009) we allowed bool or int/unsigned int using a messy trick. It's time to remove the int/unsigned int option. For this version it'll simply give a warning, but it'll break next kernel version. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>	2012-01-13 09:32:18 +10:30
Linus Torvalds	b8bf17d311	Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched: Fix lockup by limiting load-balance retries on lock-break sched: Fix CONFIG_CGROUP_SCHED dependency sched: Remove empty #ifdefs	2012-01-11 22:52:48 -08:00
Linus Torvalds	9fc5c3e323	Merge branch 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/intel config: Fix the APB_TIMER selection x86/mrst: Add additional debug prints for pb_keys x86/intel config: Revamp configuration to allow for Moorestown and Medfield x86/intel/scu/ipc: Match the changes in the x86 configuration x86/apb: Fix configuration constraints x86: Fix INTEL_MID silly x86/Kconfig: Cyclone-timer depends on x86-summit x86: Reduce clock calibration time during slave cpu startup x86/config: Revamp configuration for MID devices x86/sfi: Kill the IRQ as id hack	2012-01-11 19:13:40 -08:00
Linus Torvalds	d0b9706c20	Merge branch 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/numa: Add constraints check for nid parameters mm, x86: Remove debug_pagealloc_enabled x86/mm: Initialize high mem before free_all_bootmem() arch/x86/kernel/e820.c: quiet sparse noise about plain integer as NULL pointer arch/x86/kernel/e820.c: Eliminate bubble sort from sanitize_e820_map() x86: Fix mmap random address range x86, mm: Unify zone_sizes_init() x86, mm: Prepare zone_sizes_init() for unification x86, mm: Use max_low_pfn for ZONE_NORMAL on 64-bit x86, mm: Wrap ZONE_DMA32 with CONFIG_ZONE_DMA32 x86, mm: Use max_pfn instead of highend_pfn x86, mm: Move zone init from paging_init() on 64-bit x86, mm: Use MAX_DMA_PFN for ZONE_DMA on 32-bit	2012-01-11 19:12:10 -08:00
Linus Torvalds	57eccf1c2a	Merge branch 'nfs-for-3.3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs * 'nfs-for-3.3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: NFSv4: Change the default setting of the nfs4_disable_idmapping parameter NFSv4: Save the owner/group name string when doing open NFS: Remove pNFS bloat from the generic write path pnfs-obj: Must return layout on IO error pnfs-obj: pNFS errors are communicated on iodata->pnfs_error NFS: Cache state owners after files are closed NFS: Clean up nfs4_find_state_owners_locked() NFSv4: include bitmap in nfsv4 get acl data nfs: fix a minor do_div portability issue NFSv4.1: cleanup comment and debug printk NFSv4.1: change nfs4_free_slot parameters for dynamic slots NFSv4.1: cleanup init and reset of session slot tables NFSv4.1: fix backchannel slotid off-by-one bug nfs: fix regression in handling of context= option in NFSv4 NFS - fix recent breakage to NFS error handling. NFS: Retry mounting NFSROOT SUNRPC: Clean up the RPCSEC_GSS service ticket requests	2012-01-10 14:57:40 -08:00
Fabio Estevam	6db9dc150e	sched: Fix CONFIG_CGROUP_SCHED dependency The dependency bug was pointed out by this build warning: warning: (SCHED_AUTOGROUP) selects CGROUP_SCHED which has unmet direct dependencies (CGROUPS && EXPERIMENTAL) Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com> Cc: a.p.zijlstra@chello.nl Cc: Fabio Estevam <festevam@gmail.com> Link: http://lkml.kernel.org/r/1326192383-5113-1-git-send-email-festevam@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2012-01-10 13:44:50 +01:00
Linus Torvalds	972b2c7199	Merge branch 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (165 commits) reiserfs: Properly display mount options in /proc/mounts vfs: prevent remount read-only if pending removes vfs: count unlinked inodes vfs: protect remounting superblock read-only vfs: keep list of mounts for each superblock vfs: switch ->show_options() to struct dentry * vfs: switch ->show_path() to struct dentry * vfs: switch ->show_devname() to struct dentry * vfs: switch ->show_stats to struct dentry * switch security_path_chmod() to struct path * vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb vfs: trim includes a bit switch mnt_namespace ->root to struct mount vfs: take /proc//mounts and friends to fs/proc_namespace.c vfs: opencode mntget() mnt_set_mountpoint() vfs: spread struct mount - remaining argument of next_mnt() vfs: move fsnotify junk to struct mount vfs: move mnt_devname vfs: move mnt_list to struct mount vfs: switch pnode.h macros to struct mount ...	2012-01-08 12:19:57 -08:00
Al Viro	d8c9584ea2	vfs: prefer ->dentry->d_sb to ->mnt->mnt_sb Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-06 23:16:53 -05:00
Linus Torvalds	9753dfe19a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1958 commits) net: pack skb_shared_info more efficiently net_sched: red: split red_parms into parms and vars net_sched: sfq: extend limits cnic: Improve error recovery on bnx2x devices cnic: Re-init dev->stats_addr after chip reset net_sched: Bug in netem reordering bna: fix sparse warnings/errors bna: make ethtool_ops and strings const xgmac: cleanups net: make ethtool_ops const vmxnet3" make ethtool ops const xen-netback: make ops structs const virtio_net: Pass gfp flags when allocating rx buffers. ixgbe: FCoE: Add support for ndo_get_fcoe_hbainfo() call netdev: FCoE: Add new ndo_get_fcoe_hbainfo() call igb: reset PHY after recovering from PHY power down igb: add basic runtime PM support igb: Add support for byte queue limits. e1000: cleanup CE4100 MDIO registers access e1000: unmap ce4100_gbe_mdio_base_virt in e1000_remove ...	2012-01-06 17:22:09 -08:00
Linus Torvalds	423d091dfe	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits) cpu: Export cpu_up() rcu: Apply ACCESS_ONCE() to rcu_boost() return value Revert "rcu: Permit rt_mutex_unlock() with irqs disabled" docs: Additional LWN links to RCU API rcu: Augment rcu_batch_end tracing for idle and callback state rcu: Add rcutorture tests for srcu_read_lock_raw() rcu: Make rcutorture test for hotpluggability before offlining CPUs driver-core/cpu: Expose hotpluggability to the rest of the kernel rcu: Remove redundant rcu_cpu_stall_suppress declaration rcu: Adaptive dyntick-idle preparation rcu: Keep invoking callbacks if CPU otherwise idle rcu: Irq nesting is always 0 on rcu_enter_idle_common rcu: Don't check irq nesting from rcu idle entry/exit rcu: Permit dyntick-idle with callbacks pending rcu: Document same-context read-side constraints rcu: Identify dyntick-idle CPUs on first force_quiescent_state() pass rcu: Remove dynticks false positives and RCU failures rcu: Reduce latency of rcu_prepare_for_idle() rcu: Eliminate RCU_FAST_NO_HZ grace-period hang rcu: Avoid needlessly IPIing CPUs at GP end ...	2012-01-06 08:02:40 -08:00
Chuck Lever	43717c7dae	NFS: Retry mounting NFSROOT Lukas Razik <linux@razik.name> reports that on his SPARC system, booting with an NFS root file system stopped working after commit `56463e50` "NFS: Use super.c for NFSROOT mount option parsing." We found that the network switch to which Lukas' client was attached was delaying access to the LAN after the client's NIC driver reported that its link was up. The delay was longer than the timeouts used in the NFS client during mounting. NFSROOT worked for Lukas before commit `56463e50` because in those kernels, the client's first operation was an rpcbind request to determine which port the NFS server was listening on. When that request failed after a long timeout, the client simply selected the default NFS port (2049). By that time the switch was allowing access to the LAN, and the mount succeeded. Neither of these client behaviors is desirable, so reverting `56463e50` is really not a choice. Instead, introduce a mechanism that retries the NFSROOT mount request several times. This is the same tactic that normal user space NFS mounts employ to overcome server and network delays. Signed-off-by: Lukas Razik <linux@razik.name> [ cel: match kernel coding style, add proper patch description ] [ cel: add exponential back-off ] Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Tested-by: Lukas Razik <linux@razik.name> Cc: stable@kernel.org # > 2.6.38 Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2012-01-05 10:42:39 -05:00
Al Viro	685dd2d5be	init/initramfs.c: should use umode_t Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-01-03 22:55:14 -05:00
Glauber Costa	e5671dfae5	Basic kernel memory functionality for the Memory Controller This patch lays down the foundation for the kernel memory component of the Memory Controller. As of today, I am only laying down the following files: * memory.independent_kmem_limit * memory.kmem.limit_in_bytes (currently ignored) * memory.kmem.usage_in_bytes (always zero) Signed-off-by: Glauber Costa <glommer@parallels.com> CC: Kirill A. Shutemov <kirill@shutemov.name> CC: Paul Menage <paul@paulmenage.org> CC: Greg Thelen <gthelen@google.com> CC: Johannes Weiner <jweiner@redhat.com> CC: Michal Hocko <mhocko@suse.cz> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-12-12 19:03:55 -05:00
Paul E. McKenney	b807fbff31	rcu: Permit RCU_FAST_NO_HZ to be used by TREE_PREEMPT_RCU The new implementation of RCU_FAST_NO_HZ is compatible with preemptible RCU, so this commit removes the Kconfig restriction that previously prohibited this. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2011-12-11 10:31:45 -08:00
Stanislaw Gruszka	54c29c635a	mm, x86: Remove debug_pagealloc_enabled When (no)bootmem finish operation, it pass pages to buddy allocator. Since debug_pagealloc_enabled is not set, we will do not protect pages, what is not what we want with CONFIG_DEBUG_PAGEALLOC=y. To fix remove debug_pagealloc_enabled. That variable was introduced by commit `12d6f21e` "x86: do not PSE on CONFIG_DEBUG_PAGEALLOC=y" to get more CPA (change page attribude) code testing. But currently we have CONFIG_CPA_DEBUG, which test CPA. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Acked-by: Mel Gorman <mgorman@suse.de> Cc: linux-mm@kvack.org Link: http://lkml.kernel.org/r/1322582711-14571-1-git-send-email-sgruszka@redhat.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-12-06 09:24:07 +01:00
Ming Lei	73839c5b2e	init/main.c: Execute lockdep_init() as early as possible This patch fixes a lockdep warning on ARM platforms: [ 0.000000] WARNING: lockdep init error! Arch code didn't call lockdep_init() early enough? [ 0.000000] Call stack leading to lockdep invocation was: [ 0.000000] [<c00164bc>] save_stack_trace_tsk+0x0/0x90 [ 0.000000] [<ffffffff>] 0xffffffff The warning is caused by printk inside smp_setup_processor_id(). It is safe to do this because lockdep_init() doesn't depend on smp_setup_processor_id(), so improve things that printk can be called as early as possible without lockdep complaint. Signed-off-by: Ming Lei <tom.leiming@gmail.com> Reviewed-by: Yong Zhang <yong.zhang0@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/1321508072-23853-1-git-send-email-tom.leiming@gmail.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-12-06 08:16:53 +01:00
Jack Steiner	b565201cf7	x86: Reduce clock calibration time during slave cpu startup Reduce the startup time for slave cpus. Adds hooks for an arch-specific function for clock calibration. These hooks are used on x86. If a newly started cpu has the same phys_proc_id as a core already active, uses the TSC for the delay loop and has a CONSTANT_TSC, use the already-calculated value of loops_per_jiffy. This patch reduces the time required to start slave cpus on a 4096 cpu system from: 465 sec OLD 62 sec NEW This reduces boot time on a 4096p system by almost 7 minutes. Nice... Signed-off-by: Jack Steiner <steiner@sgi.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: John Stultz <john.stultz@linaro.org> [fix CONFIG_SMP=n build] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-12-05 17:12:43 +01:00
Linus Torvalds	b32fc0a062	Merge branch 'upstream/jump-label-noearly' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen * 'upstream/jump-label-noearly' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen: jump-label: initialize jump-label subsystem much earlier x86/jump_label: add arch_jump_label_transform_static() s390/jump-label: add arch_jump_label_transform_static() jump_label: add arch_jump_label_transform_static() to optimise non-live code updates sparc/jump_label: drop arch_jump_label_text_poke_early() x86/jump_label: drop arch_jump_label_text_poke_early() jump_label: if a key has already been initialized, don't nop it out stop_machine: make stop_machine safe and efficient to call early jump_label: use proper atomic_t initializer Conflicts: - arch/x86/kernel/jump_label.c Added __init_or_module to arch_jump_label_text_poke_early vs removal of that function entirely - kernel/stop_machine.c same patch ("stop_machine: make stop_machine safe and efficient to call early") merged twice, with whitespace fix in one version	2011-11-06 20:20:46 -08:00
WANG Cong	c736de60ae	sysctl: make CONFIG_SYSCTL_SYSCALL default to n When I tried to send a patch to remove it, Andi told me we still need to keep compabitlies for old libc, so we can't remove this completely. Then just make it default to n and remove the doc from feature-removal-schedule.txt. Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Eric Biederman <ebiederm@xmission.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-11-02 16:07:02 -07:00
Will Drewry	79975f1327	init: add root=PARTUUID=UUID/PARTNROFF=%d support Expand root=PARTUUID=UUID syntax to support selecting a root partition by integer offset from a known, unique partition. This approach provides similar properties to specifying a device and partition number, but using the UUID as the unique path prior to evaluating the offset. For example, root=PARTUUID=99DE9194-FC15-4223-9192-FC243948F88B/PARTNROFF=1 selects the partition with UUID 99DE.. then select the next partition. This change is motivated by a particular usecase in Chromium OS where the bootloader can easily determine what partition it is on (by UUID) but doesn't perform general partition table walking. That said, support for this model provides a direct mechanism for the user to modify the root partition to boot without specifically needing to extract each UUID or update the bootloader explicitly when the root partition UUID is changed (if it is recreated to be larger, for instance). Pinning to a /boot-style partition UUID allows the arbitrary root partition reconfiguration/modifications with slightly less ambiguity than just [dev][partition] and less stringency than the specific root partition UUID. [sfr@canb.auug.org.au: fix init sections warning] Signed-off-by: Will Drewry <wad@chromium.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Cc: Randy Dunlap <rdunlap@xenotime.net> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-11-02 16:07:01 -07:00
Neil Armstrong	f919b9235f	init/do_mounts_rd.c: fix ramdisk identification for padded cramfs When a cramfs ramdisk padded with 512 bytes is given to the kernel, the current identify_ramdisk_image function fails to identify it. Tested with a padded cramfs image on an ARM based board. Signed-off-by: Neil Armstrong <narmstrong@neotion.com> Cc: Namhyung Kim <namhyung@gmail.com> Cc: Davidlohr Bueso <dave@gnu.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-11-02 16:06:58 -07:00
Linus Torvalds	8a4a8918ed	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits) llist: Add back llist_add_batch() and llist_del_first() prototypes sched: Don't use tasklist_lock for debug prints sched: Warn on rt throttling sched: Unify the ->cpus_allowed mask copy sched: Wrap scheduler p->cpus_allowed access sched: Request for idle balance during nohz idle load balance sched: Use resched IPI to kick off the nohz idle balance sched: Fix idle_cpu() llist: Remove cpu_relax() usage in cmpxchg loops sched: Convert to struct llist llist: Add llist_next() irq_work: Use llist in the struct irq_work logic llist: Return whether list is empty before adding in llist_add() llist: Move cpu_relax() to after the cmpxchg() llist: Remove the platform-dependent NMI checks llist: Make some llist functions inline sched, tracing: Show PREEMPT_ACTIVE state in trace_sched_switch sched: Remove redundant test in check_preempt_tick() sched: Add documentation for bandwidth control sched: Return unused runtime on group dequeue ...	2011-10-26 17:08:43 +02:00
Linus Torvalds	19b4a8d520	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits) rcu: Move propagation of ->completed from rcu_start_gp() to rcu_report_qs_rsp() rcu: Remove rcu_needs_cpu_flush() to avoid false quiescent states rcu: Wire up RCU_BOOST_PRIO for rcutree rcu: Make rcu_torture_boost() exit loops at end of test rcu: Make rcu_torture_fqs() exit loops at end of test rcu: Permit rt_mutex_unlock() with irqs disabled rcu: Avoid having just-onlined CPU resched itself when RCU is idle rcu: Suppress NMI backtraces when stall ends before dump rcu: Prohibit grace periods during early boot rcu: Simplify unboosting checks rcu: Prevent early boot set_need_resched() from __rcu_pending() rcu: Dump local stack if cannot dump all CPUs' stacks rcu: Move __rcu_read_unlock()'s barrier() within if-statement rcu: Improve rcu_assign_pointer() and RCU_INIT_POINTER() documentation rcu: Make rcu_assign_pointer() unconditionally insert a memory barrier rcu: Make rcu_implicit_dynticks_qs() locals be correct size rcu: Eliminate in_irq() checks in rcu_enter_nohz() nohz: Remove nohz_cpu_mask rcu: Document interpretation of RCU-lockdep splats rcu: Allow rcutorture's stat_interval parameter to be changed at runtime ...	2011-10-26 16:26:53 +02:00
Michal Schmidt	b1e4d20cbf	params: make dashes and underscores in parameter names truly equal The user may use "foo-bar" for a kernel parameter defined as "foo_bar". Make sure it works the other way around too. Apply the equality of dashes and underscores on early_params and __setup params as well. The example given in Documentation/kernel-parameters.txt indicates that this is the intended behaviour. With the patch the kernel accepts "log-buf-len=1M" as expected. https://bugzilla.redhat.com/show_bug.cgi?id=744545 Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> (neatened implementations)	2011-10-26 13:10:39 +10:30
Jeremy Fitzhardinge	97ce2c88f9	jump-label: initialize jump-label subsystem much earlier Initialize jump_labels much, much earlier, so they're available for use during system setup. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>	2011-10-25 11:55:15 -07:00
Ingo Molnar	22f92bacbe	Merge branch 'linus' into sched/core Merge reason: pick up the latest fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-10-04 11:09:08 +02:00
Ingo Molnar	048b718029	Merge branch 'rcu/next' of git://github.com/paulmckrcu/linux into core/rcu	2011-10-01 14:21:36 +02:00
wangyanqing	b0f84374b6	bootup: move 'usermodehelper_enable()' a little earlier Commit `d5767c5353` ("bootup: move 'usermodehelper_enable()' to the end of do_basic_setup()") moved 'usermodehelper_enable()' to end of do_basic_setup() to after the initcalls. But then I get failed to let uvesafb work on my computer, and lose the splash boot. So maybe we could start usermodehelper_enable a little early to make some task work that need eary init with the help of user mode. [ I would really prefer that initcalls not call into user space - even the real 'init' hasn't been execve'd yet, after all! But for uvesafb it really does look like we don't have much choice. I considered doing this when we mount the root filesystem, but depending on config options that is in multiple places. We could do the usermode helper enable as a rootfs_initcall().. So I'm just using wang yanqing's trivial patch. It's not wonderful, but it's simple and should work. We should revisit this some day, though. - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-09-29 19:21:01 -07:00
Paul E. McKenney	8008e129dc	rcu: Drive configuration directly from SMP and PREEMPT This commit eliminates the possibility of running TREE_PREEMPT_RCU when SMP=n and of running TINY_RCU when PREEMPT=y. People who really want these combinations can hand-edit init/Kconfig, but eliminating them as choices for production systems reduces the amount of testing required. It will also allow cutting out a few #ifdefs. Note that running TREE_RCU and TINY_RCU on single-CPU systems using SMP-built kernels is still supported. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-09-28 21:36:45 -07:00
Linus Torvalds	d5767c5353	bootup: move 'usermodehelper_enable()' to the end of do_basic_setup() Doing it just before starting to call into cpu_idle() made a sick kind of sense only because the original bug we fixed (see commit `288d5abec8`: "Boot up with usermodehelper disabled") was about problems with some scheduler data structures not being initialized, and they had better be initialized at that point. But it really didn't make any other conceptual sense, and doing it after the initial "schedule()" call for the idle thread actually opened up a race: what if the main initialization thread did everything without needing to sleep, and got all the way into user land too? Without actually having scheduled back to the idle thread? Now, in normal circumstances that doesn't ever happen, but it looks like Richard Cochran triggered exactly that on his ARM IXP4xx machines: "I have some ARM IXP4xx based machines that use the two on chip MAC ports (aka NPEs). The NPE needs a firmware in order to function. Ever since the following commit [that `288d5abec8` one], it is no longer possible to bring up the interfaces during the init scripts." with a call trace showing an ioctl coming from user space. Richard says: "The init is busybox, and the startup script does mount, syslogd, and then ifup, so that all can go by quickly." The fix is to move the usermodehelper_enable() into the main 'init' thread, and just put it after we've done all our initcalls. By then, everything really should be up, but we've obviously not actually started the user-mode portion of init yet. Reported-and-tested-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-09-28 10:23:44 -07:00
Alexander Sverdlin	808bf29b91	init: carefully handle loglevel option on kernel cmdline. When a malformed loglevel value (for example "${abc}") is passed on the kernel cmdline, the loglevel itself is being set to 0. That then suppresses all following messages, including all the errors and crashes caused by other malformed cmdline options. This could make debugging process quite tricky. This patch leaves the previous value of loglevel if the new value is incorrect and reports an error code in this case. Signed-off-by: Alexander Sverdlin <alexander.sverdlin@sysgo.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-09-21 13:18:52 -07:00
Paul Turner	ab84d31e15	sched: Introduce primitives to account for CFS bandwidth tracking In this patch we introduce the notion of CFS bandwidth, partitioned into globally unassigned bandwidth, and locally claimed bandwidth. - The global bandwidth is per task_group, it represents a pool of unclaimed bandwidth that cfs_rqs can allocate from. - The local bandwidth is tracked per-cfs_rq, this represents allotments from the global pool bandwidth assigned to a specific cpu. Bandwidth is managed via cgroupfs, adding two new interfaces to the cpu subsystem: - cpu.cfs_period_us : the bandwidth period in usecs - cpu.cfs_quota_us : the cpu bandwidth (in usecs) that this tg will be allowed to consume over period above. Signed-off-by: Paul Turner <pjt@google.com> Signed-off-by: Nikhil Rao <ncrao@google.com> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110721184756.972636699@google.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-08-14 12:03:20 +02:00
Linus Torvalds	288d5abec8	Boot up with usermodehelper disabled The core device layer sends tons of uevent notifications for each device it finds, and if the kernel has been built with a non-empty CONFIG_UEVENT_HELPER_PATH that will make us try to execute the usermode helper binary for all these events very early in the boot. Not only won't the root filesystem even be mounted at that point, we literally won't have necessarily even initialized all the process handling data structures at that point, which causes no end of silly problems even when the usermode helper doesn't actually succeed in executing. So just use our existing infrastructure to disable the usermodehelpers to make the kernel start out with them disabled. We enable them when we've at least initialized stuff a bit. Problems related to an uninitialized init_ipc_ns.ids[IPC_SHM_IDS].rw_mutex reported by various people. Reported-by: Manuel Lauss <manuel.lauss@googlemail.com> Reported-by: Richard Weinberger <richard@nod.at> Reported-by: Marc Zyngier <maz@misterjones.org> Acked-by: Kay Sievers <kay.sievers@vrfy.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Vasiliy Kulikov <segoon@openwall.com> Cc: Greg KH <greg@kroah.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-08-03 22:03:29 -10:00
Hugh Dickins	41ffe5d5ce	tmpfs: miscellaneous trivial cleanups While it's at its least, make a number of boring nitpicky cleanups to shmem.c, mostly for consistency of variable naming. Things like "swap" instead of "entry", "pgoff_t index" instead of "unsigned long idx". And since everything else here is prefixed "shmem_", better change init_tmpfs() to shmem_init(). Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Rik van Riel <riel@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-08-03 14:25:23 -10:00
Sameer Nanda	7afe1845dd	init: skip calibration delay if previously done For each CPU, do the calibration delay only once. For subsequent calls, use the cached per-CPU value of loops_per_jiffy. This saves about 200ms of resume time on dual core Intel Atom N5xx based systems. This helps bring down the kernel resume time on such systems from about 500ms to about 300ms. [akpm@linux-foundation.org: make cpu_loops_per_jiffy static] [akpm@linux-foundation.org: clean up message text] [akpm@linux-foundation.org: fix things up after upstream rmk changes] Signed-off-by: Sameer Nanda <snanda@chromium.org> Cc: Phil Carmody <ext-phil.2.carmody@nokia.com> Cc: Andrew Worsley <amworsley@gmail.com> Cc: David Daney <ddaney@caviumnetworks.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-07-25 20:57:17 -07:00
WANG Cong	00a66d2974	mm: remove the leftovers of noswapaccount In commit `a2c8990aed` ("memsw: remove noswapaccount kernel parameter"), Michal forgot to remove some left pieces of noswapaccount in the tree, this patch removes them all. Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com> Acked-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-07-25 20:57:09 -07:00
Linus Torvalds	c0c463d34a	Merge branches 'x86-urgent-for-linus', 'core-debug-for-linus', 'irq-core-for-linus' and 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: um: Make rwsem.S depend on CONFIG_RWSEM_XCHGADD_ALGORITHM * 'core-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: debug: Make CONFIG_EXPERT select CONFIG_DEBUG_KERNEL to unhide debug options * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: genirq: Remove unused CHECK_IRQ_PER_CPU() * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: perf tools, x86: Fix 32-bit compile on 64-bit system	2011-07-23 10:33:08 -07:00
Linus Torvalds	a99a7d1436	Merge branch 'timers-cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: mips: Fix i8253 clockevent fallout i8253: Cleanup outb/inb magic arm: Footbridge: Use common i8253 clockevent mips: Use common i8253 clockevent x86: Use common i8253 clockevent i8253: Create common clockevent implementation i8253: Export i8253_lock unconditionally pcpskr: MIPS: Make config dependencies finer grained pcspkr: Cleanup Kconfig dependencies i8253: Move remaining content and delete asm/i8253.h i8253: Consolidate definitions of PIT_LATCH x86: i8253: Consolidate definitions of global_clock_event i8253: Alpha, PowerPC: Remove unused asm/8253pit.h alpha: i8253: Cleanup remaining users of i8253pit.h i8253: Remove I8253_LOCK config i8253: Make pcsp sound driver use the shared i8253_lock i8253: Make pcspkr input driver use the shared i8253_lock i8253: Consolidate all kernel definitions of i8253_lock i8253: Unify all kernel declarations of i8253_lock i8253: Create linux/i8253.h and use it in all 8253 related files	2011-07-22 16:51:56 -07:00
Russell King	1b19ca9f0b	Fix CPU spinlock lockups on secondary CPU bringup Secondary CPU bringup typically calls calibrate_delay() during its initialization. However, calibrate_delay() modifies a global variable (loops_per_jiffy) used for udelay() and __delay(). A side effect of `71c696b1` ("calibrate: extract fall-back calculation into own helper") introduced in the 2.6.39 merge window means that we end up with a substantial period where loops_per_jiffy is zero. This causes the spinlock debugging code to malfunction: u64 loops = loops_per_jiffy * HZ; for (;;) { for (i = 0; i < loops; i++) { if (arch_spin_trylock(&lock->raw_lock)) return; __delay(1); } ... } by never calling arch_spin_trylock() - resulting in the CPU locking up in an infinite loop inside __spin_lock_debug(). Work around this by only writing to loops_per_jiffy only once we have completed all the calibration decisions. Tested-by: Santosh Shilimkar <santosh.shilimkar@ti.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: <stable@kernel.org> (2.6.39-stable) -- Better solutions (such as omitting the calibration for secondary CPUs, or arranging for calibrate_delay() to return the LPJ value and leave it to the caller to decide where to store it) are a possibility, but would be much more invasive into each architecture. I think this is the best solution for -rc and stable, but it should be revisited for the next merge window. init/calibrate.c \| 14 ++++++++------ 1 files changed, 8 insertions(+), 6 deletions(-) Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-06-23 08:59:38 -07:00
Takao Indoh	d8ad7d1123	generic-ipi: Fix kexec boot crash by initializing call_single_queue before enabling interrupts There is a problem that kdump(2nd kernel) sometimes hangs up due to a pending IPI from 1st kernel. Kernel panic occurs because IPI comes before call_single_queue is initialized. To fix the crash, rename init_call_single_data() to call_function_init() and call it in start_kernel() so that call_single_queue can be initialized before enabling interrupts. The details of the crash are: (1) 2nd kernel boots up (2) A pending IPI from 1st kernel comes when irqs are first enabled in start_kernel(). (3) Kernel tries to handle the interrupt, but call_single_queue is not initialized yet at this point. As a result, in the generic_smp_call_function_single_interrupt(), NULL pointer dereference occurs when list_replace_init() tries to access &q->list.next. Therefore this patch changes the name of init_call_single_data() to call_function_init() and calls it before local_irq_enable() in start_kernel(). Signed-off-by: Takao Indoh <indou.takao@jp.fujitsu.com> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Acked-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Milton Miller <miltonm@bga.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: kexec@lists.infradead.org Link: http://lkml.kernel.org/r/D6CBEE2F420741indou.takao@jp.fujitsu.com Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-06-17 10:17:12 +02:00
Josh Triplett	d2c3225879	gcov: disable CONFIG_CONSTRUCTORS when not needed by CONFIG_GCOV_KERNEL CONFIG_CONSTRUCTORS controls support for running constructor functions at kernel init time. According to commit `b99b87f70c` ("kernel: constructor support"), gcov (CONFIG_GCOV_KERNEL) needs this. However, CONFIG_CONSTRUCTORS currently defaults to y, with no option to disable it, and CONFIG_GCOV_KERNEL depends on it. Instead, default it to n and have CONFIG_GCOV_KERNEL select it, so that the normal case of CONFIG_GCOV_KERNEL=n will result in CONFIG_CONSTRUCTORS=n. Observed in the short list of =y values in a minimal kernel configuration. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Acked-by: WANG Cong <xiyou.wangcong@gmail.com> Acked-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-06-15 20:04:01 -07:00
Borislav Petkov	de695e159e	init/calibrate.c: remove annoying printk Remove calibrate_delay_direct()'s KERN_DEBUG printk related to bogomips calculation as it appears when booting every core on setups with 'ignore_loglevel' which dmesg people scan for possible issues. As the message doesn't show very useful information to the widest audience of kernel boot message gazers, it should be removed. Introduced by commit `d2b463135f` ("init/calibrate.c: fix for critical bogoMIPS intermittent calculation failure"). Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> Cc: Andrew Worsley <amworsley@gmail.com> Cc: Phil Carmody <ext-phil.2.carmody@nokia.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-06-15 20:04:01 -07:00
Josh Triplett	bd5dc17be8	uts: make default hostname configurable, rather than always using "(none)" The "hostname" tool falls back to setting the hostname to "localhost" if /etc/hostname does not exist. Distribution init scripts have the same fallback. However, if userspace never calls sethostname, such as when booting with init=/bin/sh, or otherwise booting a minimal system without the usual init scripts, the default hostname of "(none)" remains, unhelpfully appearing in various places such as prompts ("root@(none):~#") and logs. Furthermore, "(none)" doesn't typically resolve to anything useful. Make the default hostname configurable. This removes the need for the standard fallback, provides a useful default for systems that never call sethostname, and makes minimal systems that much more useful with less configuration. Distributions could choose to use "localhost" here to avoid the fallback, while embedded systems may wish to use a specific target hostname. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: David Miller <davem@davemloft.net> Cc: Serge Hallyn <serue@us.ibm.com> Cc: Kel Modderman <kel@otaku42.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-06-15 20:04:00 -07:00
Ralf Baechle	8761f1ab71	pcspkr: Cleanup Kconfig dependencies Lenghty lists of the kind "depends on ARCH1 \|\| ARCH2 ... \|\| ARCH123" are usually either wrong or too coarse grained. Or plain an ugly sin. [ tglx: Fixed up amigaone ] Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Matt Turner <mattst88@gmail.com> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linux-alpha@vger.kernel.org Cc: linux-mips@linux-mips.org Cc: linuxppc-dev@lists.ozlabs.org Cc: Gerhard Pircher <gerhard_pircher@gmx.net> Link: http://lkml.kernel.org/r/20110601180610.984881988@duck.linux-mips.net Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-06-09 15:01:41 +02:00
Ralf Baechle	15f304b664	i8253: Consolidate all kernel definitions of i8253_lock Move them to drivers/clocksource/i8253.c and remove the implementations in arch/ [ tglx: Avoid the extra file in lib - folded arch patches in. The export will become conditional in a later step ] Signed-off-by: Ralf Baechle <ralf@linux-mips.org> Link: http://lkml.kernel.org/r/20110601180610.221426078@duck.linux-mips.net Cc: Russell King <linux@arm.linux.org.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2011-06-09 15:01:38 +02:00
Josh Triplett	f505c553db	debug: Make CONFIG_EXPERT select CONFIG_DEBUG_KERNEL to unhide debug options Several debugging options currently default to y, such as CONFIG_DEBUG_BUGVERBOSE and CONFIG_DEBUG_RODATA. Embedded users might want to turn those options off to save space; however, turning them off requires turning on CONFIG_DEBUG_KERNEL to unhide them. Since CONFIG_DEBUG_KERNEL exists specifically to unhide debugging options, and CONFIG_EXPERT exists specifically to unhide options potentially needed by experts and/or embedded users, make CONFIG_EXPERT automatically imply CONFIG_DEBUG_KERNEL. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Acked-by: Frederic Weisbecker <fweisbec@gmail.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/20110606012358.GA1909@leaf Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-06-07 00:05:15 +02:00
Linus Torvalds	6345d24daf	mm: Fix boot crash in mm_alloc() Thomas Gleixner reports that we now have a boot crash triggered by CONFIG_CPUMASK_OFFSTACK=y: BUG: unable to handle kernel NULL pointer dereference at (null) IP: [<c11ae035>] find_next_bit+0x55/0xb0 Call Trace: [<c11addda>] cpumask_any_but+0x2a/0x70 [<c102396b>] flush_tlb_mm+0x2b/0x80 [<c1022705>] pud_populate+0x35/0x50 [<c10227ba>] pgd_alloc+0x9a/0xf0 [<c103a3fc>] mm_init+0xec/0x120 [<c103a7a3>] mm_alloc+0x53/0xd0 which was introduced by commit `de03c72cfc` ("mm: convert mm->cpu_vm_cpumask into cpumask_var_t"), and is due to wrong ordering of mm_init() vs mm_init_cpumask Thomas wrote a patch to just fix the ordering of initialization, but I hate the new double allocation in the fork path, so I ended up instead doing some more radical surgery to clean it all up. Reported-by: Thomas Gleixner <tglx@linutronix.de> Reported-by: Ingo Molnar <mingo@elte.hu> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-29 11:32:28 -07:00
Daniel Lezcano	a77aea9201	cgroup: remove the ns_cgroup The ns_cgroup is an annoying cgroup at the namespace / cgroup frontier and leads to some problems: * cgroup creation is out-of-control * cgroup name can conflict when pids are looping * it is not possible to have a single process handling a lot of namespaces without falling in a exponential creation time * we may want to create a namespace without creating a cgroup The ns_cgroup was replaced by a compatibility flag 'clone_children', where a newly created cgroup will copy the parent cgroup values. The userspace has to manually create a cgroup and add a task to the 'tasks' file. This patch removes the ns_cgroup as suggested in the following thread: https://lists.linux-foundation.org/pipermail/containers/2009-June/018616.html The 'cgroup_clone' function is removed because it is no longer used. This is a userspace-visible change. Commit `45531757b4` ("cgroup: notify ns_cgroup deprecated") (merged into 2.6.27) caused the kernel to emit a printk warning users that the feature is planned for removal. Since that time we have heard from XXX users who were affected by this. Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Jamal Hadi Salim <hadi@cyberus.ca> Reviewed-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Paul Menage <menage@google.com> Acked-by: Matt Helsley <matthltc@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-26 17:12:34 -07:00
Mike Travis	162a7e7500	printk: allocate kernel log buffer earlier On larger systems, because of the numerous ACPI, Bootmem and EFI messages, the static log buffer overflows before the larger one specified by the log_buf_len param is allocated. Minimize the overflow by allocating the new log buffer as soon as possible. On kernels without memblock, a later call to setup_log_buf from kernel/init.c is the fallback. [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix CONFIG_PRINTK=n build] Signed-off-by: Mike Travis <travis@sgi.com> Cc: Yinghai Lu <yhlu.kernel@gmail.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Jack Steiner <steiner@sgi.com> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-25 08:39:48 -07:00
Andrew Worsley	d2b463135f	init/calibrate.c: fix for critical bogoMIPS intermittent calculation failure A fix to the TSC (Time Stamp Counter) based bogoMIPS calculation used on secondary CPUs which has two faults: 1: Not handling wrapping of the lower 32 bits of the TSC counter on 32bit kernel - perhaps TSC is not reset by a warm reset? 2: TSC and Jiffies are no incrementing together properly. Either jiffies increment too quickly or Time Stamp Counter isn't incremented in during an SMI but the real time clock is and jiffies are incremented. Case 1 can result in a factor of 16 too large a value which makes udelay() values too small and can cause mysterious driver errors. Case 2 appears to give smaller 10-15% errors after averaging but enough to cause occasional failures on my own board I have tested this code on my own branch and attach patch suitable for current kernel code. See below for examples of the failures and how the fix handles these situations now. I reported this issue earlier here: Intermittent problem with BogoMIPs calculation on Intel AP CPUs - http://marc.info/?l=linux-kernel&m=129947246316875&w=4 I suspect this issue has been seen by others but as it is intermittent and bogoMIPS for secondary CPUs are no longer printed out it might have been difficult to identify this as the cause. Perhaps these unresolved issues, although quite old, might be relevant as possibly this fault has been around for a while. In particular Case 1 may only be relevant to 32bit kernels on newer HW (most people run 64bit kernels?). Case 2 is less dramatic since the earlier fix in this area and also intermittent. Re: bogomips discrepancy on Intel Core2 Quad CPU - http://marc.info/?l=linux-kernel&m=118929277524298&w=4 slow system and bogus bogomips - http://marc.info/?l=linux-kernel&m=116791286716107&w=4 Re: Re: [RFC-PATCH] clocksource: update lpj if clocksource has - http://marc.info/?l=linux-kernel&m=128952775819467&w=4 This issue is masked a little by commit `feae3203d7` ("timers, init: Limit the number of per cpu calibration bootup messages") which only prints out the first bogoMIPS value making it much harder to notice other values differing. Perhaps it should be changed to only suppress them when they are similar values? Here are some outputs showing faults occurring and the new code handling them properly. See my earlier message for examples of the original failure. Case 1: A Time Stamp Counter wrap: ... Calibrating delay loop (skipped), value calculated using timer frequency.. 6332.70 BogoMIPS (lpj=31663540) .... calibrate_delay_direct() timer_rate_max=31666493 timer_rate_min=31666151 pre_start=4170369255 pre_end=4202035539 calibrate_delay_direct() timer_rate_max=2425955274 timer_rate_min=2425954941 pre_start=4265368533 pre_end=2396356387 calibrate_delay_direct() ignoring timer_rate as we had a TSC wrap around start=4265368581 >=post_end=2396356511 calibrate_delay_direct() timer_rate_max=31666274 timer_rate_min=31665942 pre_start=2440373374 pre_end=2472039515 calibrate_delay_direct() timer_rate_max=31666492 timer_rate_min=31666160 pre_start=2535372139 pre_end=2567038422 calibrate_delay_direct() timer_rate_max=31666455 timer_rate_min=31666207 pre_start=2630371084 pre_end=2662037415 Calibrating delay using timer specific routine.. 6333.28 BogoMIPS (lpj=31666428) Total of 2 processors activated (12665.99 BogoMIPS). .... Case 2: Some thing (presumably the SMM interrupt?) causing the very low increase in TSC counter for the DELAY_CALIBRATION_TICKS increase in jiffies ... Calibrating delay loop (skipped), value calculated using timer frequency.. 6333.25 BogoMIPS (lpj=31666270) ... calibrate_delay_direct() timer_rate_max=31666483 timer_rate_min=31666074 pre_start=4199536526 pre_end=4231202809 calibrate_delay_direct() timer_rate_max=864348 timer_rate_min=864016 pre_start=2405343672 pre_end=2406207897 calibrate_delay_direct() timer_rate_max=31666483 timer_rate_min=31666179 pre_start=2469540464 pre_end=2501206823 calibrate_delay_direct() timer_rate_max=31666511 timer_rate_min=31666122 pre_start=2564539400 pre_end=2596205712 calibrate_delay_direct() timer_rate_max=31666084 timer_rate_min=31665685 pre_start=2659538782 pre_end=2691204657 calibrate_delay_direct() dropping min bogoMips estimate 1 = 864348 Calibrating delay using timer specific routine.. 6333.27 BogoMIPS (lpj=31666390) Total of 2 processors activated (12666.53 BogoMIPS). ... After 70 boots I saw 2 variations <1% slip through [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: fix straggly printk mess] Signed-off-by: Andrew Worsley <amworsley@gmail.com> Reviewed-by: Phil Carmody <ext-phil.2.carmody@nokia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-25 08:39:46 -07:00
KOSAKI Motohiro	de03c72cfc	mm: convert mm->cpu_vm_cpumask into cpumask_var_t cpumask_t is very big struct and cpu_vm_mask is placed wrong position. It might lead to reduce cache hit ratio. This patch has two change. 1) Move the place of cpumask into last of mm_struct. Because usually cpumask is accessed only front bits when the system has cpu-hotplug capability 2) Convert cpu_vm_mask into cpumask_var_t. It may help to reduce memory footprint if cpumask_size() will use nr_cpumask_bits properly in future. In addition, this patch change the name of cpu_vm_mask with cpu_vm_mask_var. It may help to detect out of tree cpu_vm_mask users. This patch has no functional change. [akpm@linux-foundation.org: build fix] [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: David Howells <dhowells@redhat.com> Cc: Koichi Yasutake <yasutake.koichi@jp.panasonic.com> Cc: Hugh Dickins <hughd@google.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-25 08:39:21 -07:00
Linus Torvalds	2bb732cdb4	Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6 * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6: kbuild: make KBUILD_NOCMDDEP=1 handle empty built-in.o scripts/kallsyms.c: fix potential segfault scripts/gen_initramfs_list.sh: Convert to a /bin/sh script kbuild: Fix GNU make v3.80 compatibility kbuild: Fix passing -Wno-* options to gcc 4.4+ kbuild: move scripts/basic/docproc.c to scripts/docproc.c kbuild: Fix Makefile.asm-generic for um kbuild: Allow to combine multiple W= levels kbuild: Disable -Wunused-but-set-variable for gcc 4.6.0 Fix handling of backlash character in LINUX_COMPILE_BY name kbuild: asm-generic support kbuild: implement several W= levels kbuild: Fix build with binutils <= 2.19 initramfs: Use KBUILD_BUILD_TIMESTAMP for generated entries kbuild: Allow to override LINUX_COMPILE_BY and LINUX_COMPILE_HOST macros kbuild: Drop unused LINUX_COMPILE_TIME and LINUX_COMPILE_DOMAIN macros kbuild: Use the deterministic mode of ar kbuild: Call gzip with -n kbuild: move KALLSYMS_EXTRA_PASS from Kconfig to Makefile Kconfig: improve KALLSYMS_ALL documentation Fix up trivial conflict in Makefile	2011-05-24 13:31:37 -07:00
Linus Torvalds	e98bae7592	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc-next-2.6: (28 commits) sparc32: fix build, fix missing cpu_relax declaration SCHED_TTWU_QUEUE is not longer needed since sparc32 now implements IPI sparc32,leon: Remove unnecessary page_address calls in LEON DMA API. sparc: convert old cpumask API into new one sparc32, sun4d: Implemented SMP IPIs support for SUN4D machines sparc32, sun4m: Implemented SMP IPIs support for SUN4M machines sparc32,leon: Implemented SMP IPIs for LEON CPU sparc32: implement SMP IPIs using the generic functions sparc32,leon: SMP power down implementation sparc32,leon: added some SMP comments sparc: add {read,write}*_be routines sparc32,leon: don't rely on bootloader to mask IRQs sparc32,leon: operate on boot-cpu IRQ controller registers sparc32: always define boot_cpu_id sparc32: removed unused code, implemented by generic code sparc32: avoid build warning at mm/percpu.c:1647 sparc32: always register a PROM based early console sparc32: probe for cpu info only during startup sparc: consolidate show_cpuinfo in cpu.c sparc32,leon: implement genirq CPU affinity ...	2011-05-22 22:06:24 -07:00
Linus Torvalds	281dc5c5ec	Give up on pushing CC_OPTIMIZE_FOR_SIZE I still happen to believe that I$ miss costs are a major thing, but sadly, -Os doesn't seem to be the solution. With or without it, gcc will miss some obvious code size improvements, and with it enabled gcc will sometimes make choices that aren't good even with high I$ miss ratios. For example, with -Os, gcc on x86 will turn a 20-byte constant memcpy into a "rep movsl". While I sincerely hope that x86 CPU's will some day do a good job at that, they certainly don't do it yet, and the cost is higher than a L1 I$ miss would be. Some day I hope we can re-enable this. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-22 14:30:36 -07:00
Daniel Hellstrom	17d9f311ec	SCHED_TTWU_QUEUE is not longer needed since sparc32 now implements IPI Signed-off-by: Daniel Hellstrom <daniel@gaisler.com> Reported-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2011-05-20 13:10:55 -07:00
Linus Torvalds	eb04f2f04e	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (78 commits) Revert "rcu: Decrease memory-barrier usage based on semi-formal proof" net,rcu: convert call_rcu(prl_entry_destroy_rcu) to kfree batman,rcu: convert call_rcu(softif_neigh_free_rcu) to kfree_rcu batman,rcu: convert call_rcu(neigh_node_free_rcu) to kfree() batman,rcu: convert call_rcu(gw_node_free_rcu) to kfree_rcu net,rcu: convert call_rcu(kfree_tid_tx) to kfree_rcu() net,rcu: convert call_rcu(xt_osf_finger_free_rcu) to kfree_rcu() net/mac80211,rcu: convert call_rcu(work_free_rcu) to kfree_rcu() net,rcu: convert call_rcu(wq_free_rcu) to kfree_rcu() net,rcu: convert call_rcu(phonet_device_rcu_free) to kfree_rcu() perf,rcu: convert call_rcu(swevent_hlist_release_rcu) to kfree_rcu() perf,rcu: convert call_rcu(free_ctx) to kfree_rcu() net,rcu: convert call_rcu(__nf_ct_ext_free_rcu) to kfree_rcu() net,rcu: convert call_rcu(net_generic_release) to kfree_rcu() net,rcu: convert call_rcu(netlbl_unlhsh_free_addr6) to kfree_rcu() net,rcu: convert call_rcu(netlbl_unlhsh_free_addr4) to kfree_rcu() security,rcu: convert call_rcu(sel_netif_free) to kfree_rcu() net,rcu: convert call_rcu(xps_dev_maps_release) to kfree_rcu() net,rcu: convert call_rcu(xps_map_release) to kfree_rcu() net,rcu: convert call_rcu(rps_map_release) to kfree_rcu() ...	2011-05-19 18:14:34 -07:00
Linus Torvalds	80fe02b5da	Merge branches 'sched-core-for-linus' and 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (60 commits) sched: Fix and optimise calculation of the weight-inverse sched: Avoid going ahead if ->cpus_allowed is not changed sched, rt: Update rq clock when unthrottling of an otherwise idle CPU sched: Remove unused parameters from sched_fork() and wake_up_new_task() sched: Shorten the construction of the span cpu mask of sched domain sched: Wrap the 'cfs_rq->nr_spread_over' field with CONFIG_SCHED_DEBUG sched: Remove unused 'this_best_prio arg' from balance_tasks() sched: Remove noop in alloc_rt_sched_group() sched: Get rid of lock_depth sched: Remove obsolete comment from scheduler_tick() sched: Fix sched_domain iterations vs. RCU sched: Next buddy hint on sleep and preempt path sched: Make set__buddy() work on non-task entities sched: Remove need_migrate_task() sched: Move the second half of ttwu() to the remote cpu sched: Restructure ttwu() some more sched: Rename ttwu_post_activation() to ttwu_do_wakeup() sched: Remove rq argument from ttwu_stat() sched: Remove rq->lock from the first half of ttwu() sched: Drop rq->lock from sched_exec() ... 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Fix rt_rq runtime leakage bug	2011-05-19 17:41:22 -07:00
Catalin Marinas	9b090f2da8	kmemleak: Initialise kmemleak after debug_objects_mem_init() Kmemleak frees objects via RCU and when CONFIG_DEBUG_OBJECTS_RCU_HEAD is enabled, the RCU callback triggers a call to free_object() in lib/debugobjects.c. Since kmemleak is initialised before debug objects initialisation, it may result in a kernel panic during booting. This patch moves the kmemleak_init() call after debug_objects_mem_init(). Reported-by: Marcin Slusarz <marcin.slusarz@gmail.com> Tested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Cc: <stable@kernel.org>	2011-05-19 17:36:37 +01:00
Ingo Molnar	9cb5baba5e	Merge commit 'v2.6.39-rc7' into sched/core	2011-05-12 09:36:18 +02:00
David Rientjes	21a43e397e	slub: Revert "[PARISC] slub: fix panic with DISCONTIGMEM" This reverts commit `4a5fa3590f`, which did not allow SLUB to be used on architectures that use DISCONTIGMEM without compiling NUMA support without CONFIG_BROKEN also set. The slub panic that it was intended to prevent is addressed by `d9b41e0b54` ("[PARISC] set memory ranges in N_NORMAL_MEMORY when onlined") on parisc so there is no further slub issues with such a configuration. The reverts allows SLUB now to be used on such architectures since there haven't been any reports of additional errors. Cc: James Bottomley <James.Bottomley@suse.de> Signed-off-by: David Rientjes <rientjes@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-05-10 17:37:34 -07:00
Paul E. McKenney	27f4d28057	rcu: priority boosting for TREE_PREEMPT_RCU Add priority boosting for TREE_PREEMPT_RCU, similar to that for TINY_PREEMPT_RCU. This is enabled by the default-off RCU_BOOST kernel parameter. The priority to which to boost preempted RCU readers is controlled by the RCU_BOOST_PRIO kernel parameter (defaulting to real-time priority 1) and the time to wait before boosting the readers who are blocking a given grace period is controlled by the RCU_BOOST_DELAY kernel parameter (defaulting to 500 milliseconds). Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Reviewed-by: Josh Triplett <josh@joshtriplett.org>	2011-05-05 23:16:55 -07:00
Linus Torvalds	e8dad69408	Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6 * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/parisc-2.6: [PARISC] slub: fix panic with DISCONTIGMEM [PARISC] set memory ranges in N_NORMAL_MEMORY when onlined	2011-04-27 15:20:33 -07:00
Randy Dunlap	6befe5f69b	init/Kconfig: fix EXPERT menu list The EXPERT menu list was recently broken by the insertion of a kconfig symbol (EMBEDDED) at the beginning of the EXPERT list of kconfig items. Broken by: commit `6a108a14fa` Author: David Rientjes <rientjes@google.com> Date: Thu Jan 20 14:44:16 2011 -0800 kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT Restore the EXPERT menu list -- don't inject a symbol (EMBEDDED) that does not depend on EXPERT into the list. Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: David Rientjes <rientjes@google.com> Cc: Peter Foley <pefoley2@verizon.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-04-26 20:48:37 -07:00
James Bottomley	4a5fa3590f	[PARISC] slub: fix panic with DISCONTIGMEM Slub makes assumptions about page_to_nid() which are violated by DISCONTIGMEM and !NUMA. This violation results in a panic because page_to_nid() can be non-zero for pages in the discontiguous ranges and this leads to a null return by get_node(). The assertion by the maintainer is that DISCONTIGMEM should only be allowed when NUMA is also defined. However, at least six architectures: alpha, ia64, m32r, m68k, mips, parisc violate this. The panic is a regression against slab, so just mark slub broken in the problem configuration to prevent users reporting these panics. Cc: stable@kernel.org Acked-by: David Rientjes <rientjes@google.com> Acked-by: Pekka Enberg <penberg@kernel.org> Signed-off-by: James Bottomley <James.Bottomley@suse.de>	2011-04-22 15:42:46 -05:00
Artem Bityutskiy	1e2795a119	kbuild: move KALLSYMS_EXTRA_PASS from Kconfig to Makefile At the moment we have the CONFIG_KALLSYMS_EXTRA_PASS Kconfig switch, which users can enable or disable while configuring the kernel. This option is then used by 'make' to determine whether an extra kallsyms pass is needed or not. However, this approach is not nice and confusing, and this patch moves CONFIG_KALLSYMS_EXTRA_PASS from Kconfig to Makefile instead. The rationale is below. 1. CONFIG_KALLSYMS_EXTRA_PASS is really about the build time, not run-time. There is no real need for it to be in Kconfig. It is just an additional work-around which should be used only in rare cases, when someone breaks kallsyms, so Kbuild/Makefile is much better place for this option. 2. Grepping CONFIG_KALLSYMS_EXTRA_PASS shows that many defconfigs have it enabled, probably not because they try to work-around a kallsyms bug, but just because the Kconfig help text is confusing and does not really make it clear that this option should not be used unless except when kallsyms is broken. 3. And since many people have CONFIG_KALLSYMS_EXTRA_PASS enabled in their Kconfig, we do might fail to notice kallsyms bugs in time. E.g., many testers use "make allyesconfig" to test builds, which will enable CONFIG_KALLSYMS_EXTRA_PASS and kallsyms breakage will not be noticed. To address that, this patch: 1. Kills CONFIG_KALLSYMS_EXTRA_PASS 2. Changes Makefile so that people can use "make KALLSYMS_EXTRA_PASS=1" to enable the extra pass if needed. Additionally, they may define KALLSYMS_EXTRA_PASS as an environment variable. 3. By default KALLSYMS_EXTRA_PASS is disabled and if kallsyms has issues, "make" should print a warning and suggest using KALLSYMS_EXTRA_PASS Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> [mmarek: Removed make help text, is not necessary] Signed-off-by: Michal Marek <mmarek@suse.cz>	2011-04-15 15:56:15 +02:00
Artem Bityutskiy	71a83ec7da	Kconfig: improve KALLSYMS_ALL documentation Dumb users like myself are not able to grasp from the existing KALLSYMS_ALL documentation that this option is not what they need. Improve the help message and make it clearer that KALLSYMS is enough in the majority of use cases, and KALLSYMS_ALL should really be used very rarely. Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Signed-off-by: Michal Marek <mmarek@suse.cz>	2011-04-15 15:48:01 +02:00
Peter Zijlstra	317f394160	sched: Move the second half of ttwu() to the remote cpu Now that we've removed the rq->lock requirement from the first part of ttwu() and can compute placement without holding any rq->lock, ensure we execute the second half of ttwu() on the actual cpu we want the task to run on. This avoids having to take rq->lock and doing the task enqueue remotely, saving lots on cacheline transfers. As measured using: http://oss.oracle.com/~mason/sembench.c $ for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor ; do echo performance > $i; done $ echo 4096 32000 64 128 > /proc/sys/kernel/sem $ ./sembench -t 2048 -w 1900 -o 0 unpatched: run time 30 seconds 647278 worker burns per second patched: run time 30 seconds 816715 worker burns per second Reviewed-by: Frank Rowand <frank.rowand@am.sony.com> Cc: Mike Galbraith <efault@gmx.de> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20110405152729.515897185@chello.nl	2011-04-14 08:52:41 +02:00
Lucas De Marchi	25985edced	Fix common misspellings Fixes generated by 'codespell' and manually reviewed. Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>	2011-03-31 11:26:23 -03:00
Serge E. Hallyn	59607db367	userns: add a user_namespace as creator/owner of uts_namespace The expected course of development for user namespaces targeted capabilities is laid out at https://wiki.ubuntu.com/UserNamespace. Goals: - Make it safe for an unprivileged user to unshare namespaces. They will be privileged with respect to the new namespace, but this should only include resources which the unprivileged user already owns. - Provide separate limits and accounting for userids in different namespaces. Status: Currently (as of 2.6.38) you can clone with the CLONE_NEWUSER flag to get a new user namespace if you have the CAP_SYS_ADMIN, CAP_SETUID, and CAP_SETGID capabilities. What this gets you is a whole new set of userids, meaning that user 500 will have a different 'struct user' in your namespace than in other namespaces. So any accounting information stored in struct user will be unique to your namespace. However, throughout the kernel there are checks which - simply check for a capability. Since root in a child namespace has all capabilities, this means that a child namespace is not constrained. - simply compare uid1 == uid2. Since these are the integer uids, uid 500 in namespace 1 will be said to be equal to uid 500 in namespace 2. As a result, the lxc implementation at lxc.sf.net does not use user namespaces. This is actually helpful because it leaves us free to develop user namespaces in such a way that, for some time, user namespaces may be unuseful. Bugs aside, this patchset is supposed to not at all affect systems which are not actively using user namespaces, and only restrict what tasks in child user namespace can do. They begin to limit privilege to a user namespace, so that root in a container cannot kill or ptrace tasks in the parent user namespace, and can only get world access rights to files. Since all files currently belong to the initila user namespace, that means that child user namespaces can only get world access rights to all files. While this temporarily makes user namespaces bad for system containers, it starts to get useful for some sandboxing. I've run the 'runltplite.sh' with and without this patchset and found no difference. This patch: copy_process() handles CLONE_NEWUSER before the rest of the namespaces. So in the case of clone(CLONE_NEWUSER\|CLONE_NEWUTS) the new uts namespace will have the new user namespace as its owner. That is what we want, since we want root in that new userns to be able to have privilege over it. Changelog: Feb 15: don't set uts_ns->user_ns if we didn't create a new uts_ns. Feb 23: Move extern init_user_ns declaration from init/version.c to utsname.h. Signed-off-by: Serge E. Hallyn <serge.hallyn@canonical.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Acked-by: Daniel Lezcano <daniel.lezcano@free.fr> Acked-by: David Howells <dhowells@redhat.com> Cc: James Morris <jmorris@namei.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-03-23 19:46:59 -07:00
Eric W. Biederman	45a68628d3	pid: remove the child_reaper special case in init/main.c This patchset is a cleanup and a preparation to unshare the pid namespace. These prerequisites prepare for Eric's patchset to give a file descriptor to a namespace and join an existing namespace. This patch: It turns out that the existing assignment in copy_process of the child_reaper can handle the initial assignment of child_reaper we just need to generalize the test in kernel/fork.c Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Serge E. Hallyn <serge@hallyn.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-03-23 19:46:57 -07:00
Davidlohr Bueso	ea611b2699	init: return proper error code in do_mounts_rd() In do_mounts_rd() if memory cannot be allocated, return -ENOMEM. Signed-off-by: Davidlohr Bueso <dave@gnu.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-03-22 17:44:15 -07:00
Phil Carmody	b1b5f65e53	calibrate: retry with wider bounds when converge seems to fail Systems with unmaskable interrupts such as SMIs may massively underestimate loops_per_jiffy, and fail to converge anywhere near the real value. A case seen on x86_64 was an initial estimate of 256<<12, which converged to 511<<12 where the real value should have been over 630<<12. This admitedly requires bypassing the TSC calibration (lpj_fine), and a failure to settle in the direct calibration too, but is physically possible. This failure does not depend on my previous calibration optimisation, but by luck is easy to fix with the optimisation in place with a trivial retry loop. In the context of the optimised converging method, as we can no longer trust the starting estimate, enlarge the search bounds exponentially so that the number of retries is logarithmically bounded. [akpm@linux-foundation.org: mention x86_64 SMIs in comment] Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Tested-by: Stephen Boyd <sboyd@codeaurora.org> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-03-22 17:44:12 -07:00
Phil Carmody	191e56880a	calibrate: home in on correct lpj value more quickly Binary chop with a jiffy-resync on each step to find an upper bound is slow, so just race in a tight-ish loop to find an underestimate. If done with lots of individual steps, sometimes several hundreds of iterations would be required, which would impose a significant overhead, and make the initial estimate very low. By taking slowly increasing steps there will be less overhead. E.g. an x86_64 2.67GHz could have fitted in 613 individual small delays, but in reality should have been able to fit in a single delay 644 times longer, so underestimated by 31 steps. To reach the equivalent of 644 small delays with the accelerating scheme now requires about 130 iterations, so has <1/4th of the overhead, and can therefore be expected to underestimate by only 7 steps. As now we have a better initial estimate we can binary chop over a smaller range. With the loop overhead in the initial estimate kept low, and the step sizes moderate, we won't have under-estimated by much, so chose as tight a range as we can. Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Tested-by: Stephen Boyd <sboyd@codeaurora.org> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-03-22 17:44:11 -07:00
Phil Carmody	71c696b1d0	calibrate: extract fall-back calculation into own helper The motivation for this patch series is that currently our OMAP calibrates itself using the trial-and-error binary chop fallback that some other architectures no longer need to perform. This is a lengthy process, taking 0.2s in an environment where boot time is of great interest. Patch 2/4 has two optimisations. Firstly, it replaces the initial repeated- doubling to find the relevant power of 2 with a tight loop that just does as much as it can in a jiffy. Secondly, it doesn't binary chop over an entire power of 2 range, it choses a much smaller range based on how much it squeezed in, and failed to squeeze in, during the first stage. Both are significant optimisations, and bring our calibration down from 23 jiffies to 5, and, in the process, often arrive at a more accurate lpj value. The 'bands' and 'sub-logarithmic' growth may look over-engineered, but they only cost a small level of inaccuracy in the initial guess (for all architectures) in order to avoid the very large inaccuracies that appeared during testing (on x86_64 architectures, and presumably others with less metronomic operation). Note that due to the existence of the TSC and other timers, the x86_64 will not typically use this fallback routine, but I wanted to code defensively, able to cope with all kinds of processor behaviours and kernel command line options. Patch 3/4 is an additional trap for the nightmare scenario where the initial estimate is very inaccurate, possibly due to things like SMIs. It simply retries with a larger bound. Stephen said: I tried this patch set out on an MSM7630. : : Before: : : Calibrating delay loop... 681.57 BogoMIPS (lpj=3407872) : : After: : : Calibrating delay loop... 680.75 BogoMIPS (lpj=3403776) : : But the really good news is calibration time dropped from ~247ms to ~56ms. : Sadly we won't be able to benefit from this should my udelay patches make : it into ARM because we would be using calibrate_delay_direct() instead (at : least on machines who choose to). Can we somehow reapply the logic behind : this to calibrate_delay_direct()? That would be even better, but this is : definitely a boot time improvement. : : Or maybe we could just replace calibrate_delay_direct() with this fallback : calculation? If __delay() is a thin wrapper around read_current_timer() : it should work just as well (plus patch 3 makes it handle SMIs). I'll try : that out. This patch: ... so that it can be modified more clinically. This is almost entirely cosmetic. The only change to the operation is that the global variable is only set once after the estimation is completed, rather than taking on all the intermediate values. However, there are no readers of that variable, so this change is unimportant. Signed-off-by: Phil Carmody <ext-phil.2.carmody@nokia.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Tested-by: Stephen Boyd <sboyd@codeaurora.org> Cc: Greg KH <greg@kroah.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-03-22 17:44:11 -07:00
Amerigo Wang	34db18a054	smp: move smp setup functions to kernel/smp.c Move setup_nr_cpu_ids(), smp_init() and some other SMP boot parameter setup functions from init/main.c to kenrel/smp.c, saves some #ifdef CONFIG_SMP. Signed-off-by: WANG Cong <amwang@redhat.com> Cc: Rakib Mullick <rakib.mullick@gmail.com> Cc: David Howells <dhowells@redhat.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Tejun Heo <tj@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-03-22 17:44:11 -07:00
Mandeep Singh Baines	80cdc6dae7	fs: use appropriate printk priority levels printk()s without a priority level default to KERN_WARNING. To reduce noise at KERN_WARNING, this patch set the priority level appriopriately for unleveled printks()s. This should be useful to folks that look at dmesg warnings closely. Signed-off-by: Mandeep Singh Baines <msb@chromium.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-03-22 17:44:10 -07:00
Linus Torvalds	e16b396ce3	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (47 commits) doc: CONFIG_UNEVICTABLE_LRU doesn't exist anymore Update cpuset info & webiste for cgroups dcdbas: force SMI to happen when expected arch/arm/Kconfig: remove one to many l's in the word. asm-generic/user.h: Fix spelling in comment drm: fix printk typo 'sracth' Remove one to many n's in a word Documentation/filesystems/romfs.txt: fixing link to genromfs drivers:scsi Change printk typo initate -> initiate serial, pch uart: Remove duplicate inclusion of linux/pci.h header fs/eventpoll.c: fix spelling mm: Fix out-of-date comments which refers non-existent functions drm: Fix printk typo 'failled' coh901318.c: Change initate to initiate. mbox-db5500.c Change initate to initiate. edac: correct i82975x error-info reported edac: correct i82975x mci initialisation edac: correct commented info fs: update comments to point correct document target: remove duplicate include of target/target_core_device.h from drivers/target/target_core_hba.c ... Trivial conflict in fs/eventpoll.c (spelling vs addition)	2011-03-18 10:37:40 -07:00
Linus Torvalds	f74b944419	Merge branch 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl * 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl: BKL: That's all, folks fs/locks.c: Remove stale FIXME left over from BKL conversion ipx: remove the BKL appletalk: remove the BKL x25: remove the BKL ufs: remove the BKL hpfs: remove the BKL drivers: remove extraneous includes of smp_lock.h tracing: don't trace the BKL adfs: remove the big kernel lock	2011-03-16 17:21:00 -07:00
Linus Torvalds	a5e6b135bd	Merge branch 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 * 'driver-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (50 commits) printk: do not mangle valid userspace syslog prefixes efivars: Add Documentation efivars: Expose efivars functionality to external drivers. efivars: Parameterize operations. efivars: Split out variable registration efivars: parameterize efivars efivars: Make efivars bin_attributes dynamic efivars: move efivars globals into struct efivars drivers:misc: ti-st: fix debugging code kref: Fix typo in kref documentation UIO: add PRUSS UIO driver support Fix spelling mistakes in Documentation/zh_CN/SubmittingPatches firmware: Fix unaligned memory accesses in dmi-sysfs firmware: Add documentation for /sys/firmware/dmi firmware: Expose DMI type 15 System Event Log firmware: Break out system_event_log in dmi-sysfs firmware: Basic dmi-sysfs support firmware: Add DMI entry types to the headers Driver core: convert platform_{get,set}_drvdata to static inline functions Translate linux-2.6/Documentation/magic-number.txt into Chinese ...	2011-03-16 15:05:40 -07:00
Linus Torvalds	a926021cb1	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (184 commits) perf probe: Clean up probe_point_lazy_walker() return value tracing: Fix irqoff selftest expanding max buffer tracing: Align 4 byte ints together in struct tracer tracing: Export trace_set_clr_event() tracing: Explain about unstable clock on resume with ring buffer warning ftrace/graph: Trace function entry before updating index ftrace: Add .ref.text as one of the safe areas to trace tracing: Adjust conditional expression latency formatting. tracing: Fix event alignment: skb:kfree_skb tracing: Fix event alignment: mce:mce_record tracing: Fix event alignment: kvm:kvm_hv_hypercall tracing: Fix event alignment: module:module_request tracing: Fix event alignment: ftrace:context_switch and ftrace:wakeup tracing: Remove lock_depth from event entry perf header: Stop using 'self' perf session: Use evlist/evsel for managing perf.data attributes perf top: Don't let events to eat up whole header line perf top: Fix events overflow in top command ring-buffer: Remove unused #include <linux/trace_irq.h> tracing: Add an 'overwrite' trace_option. ...	2011-03-15 18:31:30 -07:00
Aneesh Kumar K.V	990d6c2d7a	vfs: Add name to file handle conversion support The syscall also return mount id which can be used to lookup file system specific information such as uuid in /proc/<pid>/mountinfo Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2011-03-15 02:21:37 -04:00
Arnd Bergmann	4ba8216cd9	BKL: That's all, folks This removes the implementation of the big kernel lock, at last. A lot of people have worked on this in the past, I so the credit for this patch should be with everyone who participated in the hunt. The names on the Cc list are the people that were the most active in this, according to the recorded git history, in alphabetical order. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Alan Cox <alan@linux.intel.com> Cc: Alessio Igor Bogani <abogani@texware.it> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Andrew Hendry <andrew.hendry@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Eric W. Biederman <ebiederm@xmission.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Hans Verkuil <hverkuil@xs4all.nl> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Jan Blunck <jblunck@infradead.org> Cc: John Kacur <jkacur@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Oliver Neukum <oliver@neukum.org> Cc: Paul Menage <menage@google.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Trond Myklebust <Trond.Myklebust@netapp.com>	2011-03-05 10:56:00 +01:00
Li Zefan	2d0f25201e	perf cgroup: Fix a typo in kernel config s/specificied/specified Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Stephane Eranian <eranian@google.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4D6F348C.2050804@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-03-04 11:32:51 +01:00
Stephane Eranian	e5d1367f17	perf: Add cgroup support This kernel patch adds the ability to filter monitoring based on container groups (cgroups). This is for use in per-cpu mode only. The cgroup to monitor is passed as a file descriptor in the pid argument to the syscall. The file descriptor must be opened to the cgroup name in the cgroup filesystem. For instance, if the cgroup name is foo and cgroupfs is mounted in /cgroup, then the file descriptor is opened to /cgroup/foo. Cgroup mode is activated by passing PERF_FLAG_PID_CGROUP in the flags argument to the syscall. For instance to measure in cgroup foo on CPU1 assuming cgroupfs is mounted under /cgroup: struct perf_event_attr attr; int cgroup_fd, fd; cgroup_fd = open("/cgroup/foo", O_RDONLY); fd = perf_event_open(&attr, cgroup_fd, 1, -1, PERF_FLAG_PID_CGROUP); close(cgroup_fd); Signed-off-by: Stephane Eranian <eranian@google.com> [ added perf_cgroup_{exit,attach} ] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <4d590250.114ddf0a.689e.4482@mx.google.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-02-16 13:30:48 +01:00
Jiri Kosina	0a9d59a246	Merge branch 'master' into for-next	2011-02-15 10:24:31 +01:00
Tim Deegan	70a062286b	fix jiffy calculations in calibrate_delay_direct to handle overflow Fixes a hang when booting as dom0 under Xen, when jiffies can be quite large by the time the kernel init gets this far. Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com> [jbeulich@novell.com: !time_after() -> time_before_eq() as suggested by Jiri Slaby] Signed-off-by: Jan Beulich <jbeulich@novell.com> Cc: Jiri Slaby <jslaby@suse.cz> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-02-10 11:00:09 -08:00
Ferenc Wagner	5d6a4ea576	sysfs: Capitalize description of SYSFS_DEPRECATED{_V2} options Signed-off-by: Ferenc Wagner <wferi@niif.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2011-02-03 16:36:40 -08:00
Linus Torvalds	2b1caf6ed7	Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: smp: Allow on_each_cpu() to be called while early_boot_irqs_disabled status to init/main.c lockdep: Move early boot local IRQ enable/disable status to init/main.c	2011-01-20 18:30:37 -08:00
David Rientjes	6a108a14fa	kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT The meaning of CONFIG_EMBEDDED has long since been obsoleted; the option is used to configure any non-standard kernel with a much larger scope than only small devices. This patch renames the option to CONFIG_EXPERT in init/Kconfig and fixes references to the option throughout the kernel. A new CONFIG_EMBEDDED option is added that automatically selects CONFIG_EXPERT when enabled and can be used in the future to isolate options that should only be considered for embedded systems (RISC architectures, SLOB, etc). Calling the option "EXPERT" more accurately represents its intention: only expert users who understand the impact of the configuration changes they are making should enable it. Reviewed-by: Ingo Molnar <mingo@elte.hu> Acked-by: David Woodhouse <david.woodhouse@intel.com> Signed-off-by: David Rientjes <rientjes@google.com> Cc: Greg KH <gregkh@suse.de> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jens Axboe <axboe@kernel.dk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Robin Holt <holt@sgi.com> Cc: <linux-arch@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-20 17:02:05 -08:00
Tejun Heo	2ce802f62b	lockdep: Move early boot local IRQ enable/disable status to init/main.c During early boot, local IRQ is disabled until IRQ subsystem is properly initialized. During this time, no one should enable local IRQ and some operations which usually are not allowed with IRQ disabled, e.g. operations which might sleep or require communications with other processors, are allowed. lockdep tracked this with early_boot_irqs_off/on() callbacks. As other subsystems need this information too, move it to init/main.c and make it generally available. While at it, toggle the boolean to early_boot_irqs_disabled instead of enabled so that it can be initialized with %false and %true indicates the exceptional condition. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Pekka Enberg <penberg@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <20110120110635.GB6036@htj.dyndns.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-20 13:32:33 +01:00
Michael Witten	c5e0591ae3	Kconfig: BLK_THROTTLE -> BLK_DEV_THROTTLING It would seem that `CONFIG_BLK_THROTTLE' doesn't exist, as it is only referenced in the documentation for `CONFIG_BLK_CGROUP'. The only other choice is `CONFIG_BLK_DEV_THROTTLING': $ git grep --cached THROTTL -- \*Kconfig block/Kconfig:config BLK_DEV_THROTTLING init/Kconfig: CONFIG_BLK_THROTTLE=y. Signed-off-by: Michael Witten <mfwitten@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-01-17 11:15:04 +01:00
Michael Witten	79e2e7597d	Kconfig: Typo: seti -> set Also, I introduced some punctuation to facilitate reading. Signed-off-by: Michael Witten <mfwitten@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2011-01-17 11:13:55 +01:00
Linus Torvalds	f9ee7f60d6	Merge branches 'core-fixes-for-linus', 'x86-fixes-for-linus', 'timers-fixes-for-linus' and 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: rcu: avoid pointless blocked-task warnings rcu: demote SRCU_SYNCHRONIZE_DELAY from kernel-parameter status rtmutex: Fix comment about why new_owner can be NULL in wake_futex_pi() * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, olpc: Add missing Kconfig dependencies x86, mrst: Set correct APB timer IRQ affinity for secondary cpu x86: tsc: Fix calibration refinement conditionals to avoid divide by zero x86, ia64, acpi: Clean up x86-ism in drivers/acpi/numa.c * 'timers-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: timekeeping: Make local variables static time: Rename misnamed minsec argument of clocks_calc_mult_shift() * 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: tracing: Remove syscall_exit_fields tracing: Only process module tracepoints once perf record: Add "nodelay" mode, disabled by default perf sched: Fix list of events, dropping unsupported ':r' modifier Revert "perf tools: Emit clearer message for sys_perf_event_open ENOENT return" perf top: Fix annotate segv perf evsel: Fix order of event list deletion	2011-01-15 12:45:00 -08:00
Ingo Molnar	1161ec9449	Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu into core/urgent	2011-01-14 15:30:16 +01:00
Paul E. McKenney	c072a388d5	rcu: demote SRCU_SYNCHRONIZE_DELAY from kernel-parameter status Because the adaptive synchronize_srcu_expedited() approach has worked very well in testing, remove the kernel parameter and replace it by a C-preprocessor macro. If someone finds problems with this approach, a more complex and aggressively adaptive approach might be required. Longer term, SRCU will be merged with the other RCU implementations, at which point synchronize_srcu_expedited() will be event driven, just as synchronize_sched_expedited() currently is. At that point, there will be no need for this adaptive approach. Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2011-01-14 04:56:49 -08:00
Linus Torvalds	008d23e485	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (43 commits) Documentation/trace/events.txt: Remove obsolete sched_signal_send. writeback: fix global_dirty_limits comment runtime -> real-time ppc: fix comment typo singal -> signal drivers: fix comment typo diable -> disable. m68k: fix comment typo diable -> disable. wireless: comment typo fix diable -> disable. media: comment typo fix diable -> disable. remove doc for obsolete dynamic-printk kernel-parameter remove extraneous 'is' from Documentation/iostats.txt Fix spelling milisec -> ms in snd_ps3 module parameter description Fix spelling mistakes in comments Revert conflicting V4L changes i7core_edac: fix typos in comments mm/rmap.c: fix comment sound, ca0106: Fix assignment to 'channel'. hrtimer: fix a typo in comment init/Kconfig: fix typo anon_inodes: fix wrong function name in comment fix comment typos concerning "consistent" poll: fix a typo in comment ... Fix up trivial conflicts in: - drivers/net/wireless/iwlwifi/iwl-core.c (moved to iwl-legacy.c) - fs/ext4/ext4.h Also fix missed 'diabled' typo in drivers/net/bnx2x/bnx2x.h while at it.	2011-01-13 10:05:56 -08:00
Lasse Collin	3ebe12439b	decompressors: add boot-time XZ support This implements the API defined in <linux/decompress/generic.h> which is used for kernel, initramfs, and initrd decompression. This patch together with the first patch is enough for XZ-compressed initramfs and initrd; XZ-compressed kernel will need arch-specific changes. The buffering requirements described in decompress_unxz.c are stricter than with gzip, so the relevant changes should be done to the arch-specific code when adding support for XZ-compressed kernel. Similarly, the heap size in arch-specific pre-boot code may need to be increased (30 KiB is enough). The XZ decompressor needs memmove(), memeq() (memcmp() == 0), and memzero() (memset(ptr, 0, size)), which aren't available in all arch-specific pre-boot environments. I'm including simple versions in decompress_unxz.c, but a cleaner solution would naturally be nicer. Signed-off-by: Lasse Collin <lasse.collin@tukaani.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Alain Knaff <alain@knaff.lu> Cc: Albin Tonnerre <albin.tonnerre@free-electrons.com> Cc: Phillip Lougher <phillip@lougher.demon.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-13 08:03:25 -08:00
Linus Torvalds	23d69b09b7	Merge branch 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq * 'for-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (33 commits) usb: don't use flush_scheduled_work() speedtch: don't abuse struct delayed_work media/video: don't use flush_scheduled_work() media/video: explicitly flush request_module work ioc4: use static work_struct for ioc4_load_modules() init: don't call flush_scheduled_work() from do_initcalls() s390: don't use flush_scheduled_work() rtc: don't use flush_scheduled_work() mmc: update workqueue usages mfd: update workqueue usages dvb: don't use flush_scheduled_work() leds-wm8350: don't use flush_scheduled_work() mISDN: don't use flush_scheduled_work() macintosh/ams: don't use flush_scheduled_work() vmwgfx: don't use flush_scheduled_work() tpm: don't use flush_scheduled_work() sonypi: don't use flush_scheduled_work() hvsi: don't use flush_scheduled_work() xen: don't use flush_scheduled_work() gdrom: don't use flush_scheduled_work() ... Fixed up trivial conflict in drivers/media/video/bt8xx/bttv-input.c as per Tejun.	2011-01-07 16:58:04 -08:00
Linus Torvalds	65b2074f84	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (30 commits) sched: Change wait_for_completion_*_timeout() to return a signed long sched, autogroup: Fix reference leak sched, autogroup: Fix potential access to freed memory sched: Remove redundant CONFIG_CGROUP_SCHED ifdef sched: Fix interactivity bug by charging unaccounted run-time on entity re-weight sched: Move periodic share updates to entity_tick() printk: Use this_cpu_{read\|write} api on printk_pending sched: Make pushable_tasks CONFIG_SMP dependant sched: Add 'autogroup' scheduling feature: automated per session task groups sched: Fix unregister_fair_sched_group() sched: Remove unused argument dest_cpu to migrate_task() mutexes, sched: Introduce arch_mutex_cpu_relax() sched: Add some clock info to sched_debug cpu: Remove incorrect BUG_ON cpu: Remove unused variable sched: Fix UP build breakage sched: Make task dump print all 15 chars of proc comm sched: Update tg->shares after cpu.shares write sched: Allow update_cfs_load() to update global load sched: Implement demand based update_cfs_load() ...	2011-01-06 10:23:33 -08:00
Linus Torvalds	28d9bfc37c	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (146 commits) tools, perf: Documentation for the power events API perf: Add calls to suspend trace point perf script: Make some lists static perf script: Use the default lost event handler perf session: Warn about errors when processing pipe events too perf tools: Fix perf_event.h header usage perf test: Clarify some error reports in the open syscall test x86, NMI: Add touch_nmi_watchdog to io_check_error delay x86: Avoid calling arch_trigger_all_cpu_backtrace() at the same time x86: Only call smp_processor_id in non-preempt cases perf timechart: Adjust perf timechart to the new power events perf: Clean up power events by introducing new, more generic ones perf: Do not export power_frequency, but power_start event perf test: Add test for counting open syscalls perf evsel: Auto allocate resources needed for some methods perf evsel: Use {cpu,thread}_map to shorten list of parameters perf tools: Refactor all_tids to hold nr and the map perf tools: Refactor cpumap to hold nr and the map perf evsel: Introduce per cpu and per thread open helpers perf evsel: Steal the counter reading routines from stat ...	2011-01-06 10:17:26 -08:00
Linus Torvalds	2af49b6058	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: rcu: remove unused __list_for_each_rcu() macro rculist: fix borked __list_for_each_rcu() macro rcu: reduce __call_rcu()-induced contention on rcu_node structures rcu: limit rcu_node leaf-level fanout rcu: fine-tune grace-period begin/end checks rcu: Keep gpnum and completed fields synchronized rcu: Stop chasing QS if another CPU did it for us rcu: increase synchronize_sched_expedited() batching rcu: Make synchronize_srcu_expedited() fast if running readers rcu: fix race condition in synchronize_sched_expedited() rcu: update documentation/comments for Lai's adoption patch rcu,cleanup: simplify the code when cpu is dying rcu,cleanup: move synchronize_sched_expedited() out of sched.c rcu: get rid of obsolete "classic" names in TREE_RCU tracing rcu: Distinguish between boosting and boosted rcu: document TINY_RCU and TINY_PREEMPT_RCU tracing. rcu: add tracing for TINY_RCU and TINY_PREEMPT_RCU rcu: priority boosting for TINY_PREEMPT_RCU rcu: move TINY_RCU from softirq to kthread rcu: add priority-inversion testing to rcutorture	2011-01-06 10:06:26 -08:00
Ingo Molnar	aef1b9cef7	Merge commit 'v2.6.37' into perf/core Merge reason: Add the final .37 tree. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-05 14:22:10 +01:00
Ingo Molnar	27066fd484	Merge commit 'v2.6.37' into sched/core Merge reason: Merge the final .37 tree. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2011-01-05 14:14:46 +01:00
Jan Beulich	a1cf11d8f6	name_to_dev_t() must not call __init code The function can't be __init itself (being called from some sysfs handler), and hence none of the functions it calls can be either. Signed-off-by: Jan Beulich <jbeulich@novell.com> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2011-01-03 11:48:11 -08:00
Tejun Heo	ee4569a3a7	init: don't call flush_scheduled_work() from do_initcalls() The call to flush_scheduled_work() in do_initcalls() is there to make sure all works queued to system_wq by initcalls finish before the init sections are dropped. However, the call doesn't make much sense at this point - there already are multiple different workqueues and different subsystems are free to create and use their own. Ordering requirements are and should be expressed explicitly. Drop the call to prepare for the deprecation and removal of flush_scheduled_work(). Andrew suggested adding sanity check where the workqueue code checks whether any pending or running work has the work function in the init text section. However, checking this for running works requires the worker to keep track of the current function being executed, and checking only the pending works will miss most cases. As a violation will almost always be caught by the usual page fault mechanism, I don't think it would be worthwhile to make the workqueue code track extra state just for this. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org>	2010-12-24 16:14:20 +01:00
Ingo Molnar	394f4528c5	Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-2.6-rcu into core/rcu	2010-12-23 12:57:04 +01:00
Jim Cromie	43d547f9ce	init/Kconfig: fix typo Signed-off-by: Jim Cromie <jim.cromie@gmail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-12-22 18:58:36 +01:00
Peter Zijlstra	9f58a205c6	init: Initialized IDR earlier perf_event_init() wants to start using IDR trees, its needs in turn are satisfied by mm_init(). Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20101117222056.206992649@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-12-16 11:36:43 +01:00
Peter Zijlstra	24a24bb6ff	perf: Move perf_event_init() into main.c Currently we call perf_event_init() from sched_init(). In order to make it more obvious move it to the cannnonical location. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20101117222056.093629821@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-12-16 11:36:42 +01:00
Ingo Molnar	8e9255e6a2	Merge branch 'linus' into sched/core Merge reason: we want to queue up dependent cleanup Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-12-08 20:15:29 +01:00
Ingo Molnar	10a18d7dc0	Merge commit 'v2.6.37-rc5' into perf/core Merge reason: Pick up the latest -rc. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-12-07 07:49:51 +01:00
Mike Galbraith	5091faa449	sched: Add 'autogroup' scheduling feature: automated per session task groups A recurring complaint from CFS users is that parallel kbuild has a negative impact on desktop interactivity. This patch implements an idea from Linus, to automatically create task groups. Currently, only per session autogroups are implemented, but the patch leaves the way open for enhancement. Implementation: each task's signal struct contains an inherited pointer to a refcounted autogroup struct containing a task group pointer, the default for all tasks pointing to the init_task_group. When a task calls setsid(), a new task group is created, the process is moved into the new task group, and a reference to the preveious task group is dropped. Child processes inherit this task group thereafter, and increase it's refcount. When the last thread of a process exits, the process's reference is dropped, such that when the last process referencing an autogroup exits, the autogroup is destroyed. At runqueue selection time, IFF a task has no cgroup assignment, its current autogroup is used. Autogroup bandwidth is controllable via setting it's nice level through the proc filesystem: cat /proc/<pid>/autogroup Displays the task's group and the group's nice level. echo <nice level> > /proc/<pid>/autogroup Sets the task group's shares to the weight of nice <level> task. Setting nice level is rate limited for !admin users due to the abuse risk of task group locking. The feature is enabled from boot by default if CONFIG_SCHED_AUTOGROUP=y is selected, but can be disabled via the boot option noautogroup, and can also be turned on/off on the fly via: echo [01] > /proc/sys/kernel/sched_autogroup_enabled ... which will automatically move tasks to/from the root task group. Signed-off-by: Mike Galbraith <efault@gmx.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Paul Turner <pjt@google.com> Cc: Oleg Nesterov <oleg@redhat.com> [ Removed the task_group_path() debug code, and fixed !EVENTFD build failure. ] Signed-off-by: Ingo Molnar <mingo@elte.hu> LKML-Reference: <1290281700.28711.9.camel@maggy.simson.net> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-30 16:03:35 +01:00
Paul E. McKenney	46fdb0937f	rcu: Make synchronize_srcu_expedited() fast if running readers The synchronize_srcu_expedited() function is currently quick if there are no active readers, but will delay a full jiffy if there are any. If these readers leave their SRCU read-side critical sections quickly, this is way too long to wait. So this commit first waits ten microseconds, and only then falls back to jiffy-at-a-time waiting. Reported-by: Avi Kivity <avi@redhat.com> Reported-by: Marcelo Tosatti <mtosatti@redhat.com> Tested-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2010-11-29 22:02:40 -08:00
Paul E. McKenney	9e571a82f0	rcu: add tracing for TINY_RCU and TINY_PREEMPT_RCU Add tracing for the tiny RCU implementations, including statistics on boosting in the case of TINY_PREEMPT_RCU and RCU_BOOST. Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2010-11-29 22:01:55 -08:00
Paul E. McKenney	24278d1483	rcu: priority boosting for TINY_PREEMPT_RCU Add priority boosting, but only for TINY_PREEMPT_RCU. This is enabled by the default-off RCU_BOOST kernel parameter. The priority to which to boost preempted RCU readers is controlled by the RCU_BOOST_PRIO kernel parameter (defaulting to real-time priority 1) and the time to wait before boosting the readers blocking a given grace period is controlled by the RCU_BOOST_DELAY kernel parameter (defaulting to 500 milliseconds). Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2010-11-29 22:01:54 -08:00
Peter Zijlstra	004417a6d4	perf, arch: Cleanup perf-pmu init vs lockup-detector The perf hardware pmu got initialized at various points in the boot, some before early_initcall() some after (notably arch_initcall). The problem is that the NMI lockup detector is ran from early_initcall() and expects the hardware pmu to be present. Sanitize this by moving all architecture hardware pmu implementations to initialize at early_initcall() and move the lockup detector to an explicit initcall right after that. Cc: paulus <paulus@samba.org> Cc: davem <davem@davemloft.net> Cc: Michael Cree <mcree@orcon.net.nz> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1290707759.2145.119.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-11-26 15:14:56 +01:00
Michal Hocko	a42c390cfa	cgroups: make swap accounting default behavior configurable Swap accounting can be configured by CONFIG_CGROUP_MEM_RES_CTLR_SWAP configuration option and then it is turned on by default. There is a boot option (noswapaccount) which can disable this feature. This makes it hard for distributors to enable the configuration option as this feature leads to a bigger memory consumption and this is a no-go for general purpose distribution kernel. On the other hand swap accounting may be very usuful for some workloads. This patch adds a new configuration option which controls the default behavior (CGROUP_MEM_RES_CTLR_SWAP_ENABLED). If the option is selected then the feature is turned on by default. It also adds a new boot parameter swapaccount[=1\|0] which enhances the original noswapaccount parameter semantic by means of enable/disable logic (defaults to 1 if no value is provided to be still consistent with noswapaccount). The default behavior is unchanged (if CONFIG_CGROUP_MEM_RES_CTLR_SWAP is enabled then CONFIG_CGROUP_MEM_RES_CTLR_SWAP_ENABLED is enabled as well) Signed-off-by: Michal Hocko <mhocko@suse.cz> Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-11-25 06:50:45 +09:00
Arnd Bergmann	451a3c24b0	BKL: remove extraneous #include <smp_lock.h> The big kernel lock has been removed from all these files at some point, leaving only the #include. Remove this too as a cleanup. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-11-17 08:59:32 -08:00
Linus Torvalds	c9e2a72ff1	Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6 * 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild-2.6: initramfs: Fix build break on symbol-prefixed archs initramfs: fix initramfs size calculation initramfs: generalize initramfs_data.xxx.S variants scripts/kallsyms: Enable error messages while hush up unnecessary warnings scripts/setlocalversion: update comment kbuild: Use a single clean rule for kernel and external modules kbuild: Do not run make clean in $(srctree) scripts/mod/modpost.c: fix commentary accordingly to last changes kbuild: Really don't clean bounds.h and asm-offsets.h	2010-10-28 15:13:55 -07:00
Daniel Lezcano	7af37bec41	namespaces Kconfig: move namespace menu location after the cgroup We have the namespaces as a menuconfig like the cgroup. The cgroup and the namespace are two base bricks for the containers. It is more logical to put the namespace menu right after the cgroup menu. Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: "Serge E. Hallyn" <serue@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-27 18:03:17 -07:00
Daniel Lezcano	eef691b36e	namespaces Kconfig: remove the cgroup device whitelist experimental tag This subsystem is merged since a long time now, I think we can consider it mature enough. Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: "Serge E. Hallyn" <serue@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-27 18:03:17 -07:00
Daniel Lezcano	79ae9c29e4	namespaces Kconfig: remove pointless cgroup dependency The different cgroup subsystems are under the cgroup submenu. The dependency between the cgroups and the menu subsystems is pointless. Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> Acked-by: Li Zefan <lizf@cn.fujitsu.com> Cc: "Serge E. Hallyn" <serue@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-27 18:03:16 -07:00
Daniel Lezcano	8dd2a82c29	namespaces Kconfig: make namespace a submenu Make the namespaces config option a submenu. Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: "Serge E. Hallyn" <serue@us.ibm.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Paul Menage <menage@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-27 18:03:16 -07:00
Daniel Lezcano	17a6d4411a	namespaces: default all the namespaces to 'yes' when CONFIG_NAMESPACES is selected As the different namespaces depend on 'CONFIG_NAMESPACES', it is logical to enable all the namespaces when we enable NAMESPACES. Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: David Miller <davem@davemloft.net> Acked-By: Matt Helsley <matthltc@us.ibm.com> Acked-by: Serge Hallyn <serge.hallyn@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-27 18:03:16 -07:00
Daniel Lezcano	9bd38c2cda	namespaces: remove pid_ns and net_ns experimental status The pid namespace is in the kernel since 2.6.27 and the net_ns since 2.6.29. They are enabled in the distro by default and used by userspace component. They are mature enough to remove the 'experimental' label. Signed-off-by: Daniel Lezcano <daniel.lezcano@free.fr> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: David Miller <davem@davemloft.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-27 18:03:16 -07:00
Namhyung Kim	562f5e638d	init: mark __user address space on string literals When calling syscall service routines in kernel, some of arguments should be user pointers but were missing __user markup on string literals. Add it. Removes some sparse warnings. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Cc: Phillip Lougher <phillip@lougher.demon.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-10-26 16:52:15 -07:00
Linus Torvalds	74eb94b218	Merge branch 'nfs-for-2.6.37' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6 * 'nfs-for-2.6.37' of git://git.linux-nfs.org/projects/trondmy/nfs-2.6: (67 commits) SUNRPC: Cleanup duplicate assignment in rpcauth_refreshcred nfs: fix unchecked value Ask for time_delta during fsinfo probe Revalidate caches on lock SUNRPC: After calling xprt_release(), we must restart from call_reserve NFSv4: Fix up the 'dircount' hint in encode_readdir NFSv4: Clean up nfs4_decode_dirent NFSv4: nfs4_decode_dirent must clear entry->fattr->valid NFSv4: Fix a regression in decode_getfattr NFSv4: Fix up decode_attr_filehandle() to handle the case of empty fh pointer NFS: Ensure we check all allocation return values in new readdir code NFS: Readdir plus in v4 NFS: introduce generic decode_getattr function NFS: check xdr_decode for errors NFS: nfs_readdir_filler catch all errors NFS: readdir with vmapped pages NFS: remove page size checking code NFS: decode_dirent should use an xdr_stream SUNRPC: Add a helper function xdr_inline_peek NFS: remove readdir plus limit ...	2010-10-25 13:48:29 -07:00
Linus Torvalds	229aebb873	Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits) Update broken web addresses in arch directory. Update broken web addresses in the kernel. Revert "drivers/usb: Remove unnecessary return's from void functions" for musb gadget Revert "Fix typo: configuation => configuration" partially ida: document IDA_BITMAP_LONGS calculation ext2: fix a typo on comment in ext2/inode.c drivers/scsi: Remove unnecessary casts of private_data drivers/s390: Remove unnecessary casts of private_data net/sunrpc/rpc_pipe.c: Remove unnecessary casts of private_data drivers/infiniband: Remove unnecessary casts of private_data drivers/gpu/drm: Remove unnecessary casts of private_data kernel/pm_qos_params.c: Remove unnecessary casts of private_data fs/ecryptfs: Remove unnecessary casts of private_data fs/seq_file.c: Remove unnecessary casts of private_data arm: uengine.c: remove C99 comments arm: scoop.c: remove C99 comments Fix typo configue => configure in comments Fix typo: configuation => configuration Fix typo interrest[ing\|ed] => interest[ing\|ed] Fix various typos of valid in comments ... Fix up trivial conflicts in: drivers/char/ipmi/ipmi_si_intf.c drivers/usb/gadget/rndis.c net/irda/irnet/irnet_ppp.c	2010-10-24 13:41:39 -07:00
Linus Torvalds	b9da057105	Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: (31 commits) driver core: Display error codes when class suspend fails Driver core: Add section count to memory_block struct Driver core: Add mutex for adding/removing memory blocks Driver core: Move find_memory_block routine hpilo: Despecificate driver from iLO generation driver core: Convert link_mem_sections to use find_memory_block_hinted. driver core: Introduce find_memory_block_hinted which utilizes kset_find_obj_hinted. kobject: Introduce kset_find_obj_hinted. driver core: fix build for CONFIG_BLOCK not enabled driver-core: base: change to new flag variable sysfs: only access bin file vm_ops with the active lock sysfs: Fail bin file mmap if vma close is implemented. FW_LOADER: fix kconfig dependency warning on HOTPLUG uio: Statically allocate uio_class and use class .dev_attrs. uio: Support 2^MINOR_BITS minors uio: Cleanup irq handling. uio: Don't clear driver data uio: Fix lack of locking in init_uio_class SYSFS: Allow boot time switching between deprecated and modern sysfs layout driver core: remove CONFIG_SYSFS_DEPRECATED_V2 but keep it for block devices ...	2010-10-22 19:36:42 -07:00
Linus Torvalds	e9dd2b6837	Merge branch 'for-2.6.37/core' of git://git.kernel.dk/linux-2.6-block * 'for-2.6.37/core' of git://git.kernel.dk/linux-2.6-block: (39 commits) cfq-iosched: Fix a gcc 4.5 warning and put some comments block: Turn bvec_k{un,}map_irq() into static inline functions block: fix accounting bug on cross partition merges block: Make the integrity mapped property a bio flag block: Fix double free in blk_integrity_unregister block: Ensure physical block size is unsigned int blkio-throttle: Fix possible multiplication overflow in iops calculations blkio-throttle: limit max iops value to UINT_MAX blkio-throttle: There is no need to convert jiffies to milli seconds blkio-throttle: Fix link failure failure on i386 blkio: Recalculate the throttled bio dispatch time upon throttle limit change blkio: Add root group to td->tg_list blkio: deletion of a cgroup was causes oops blkio: Do not export throttle files if CONFIG_BLK_DEV_THROTTLING=n block: set the bounce_pfn to the actual DMA limit rather than to max memory block: revert bad fix for memory hotplug causing bounces Fix compile error in blk-exec.c for !CONFIG_DETECT_HUNG_TASK block: set the bounce_pfn to the actual DMA limit rather than to max memory block: Prevent hang_check firing during long I/O cfq: improve fsync performance for small files ... Fix up trivial conflicts due to __rcu sparse annotation in include/linux/genhd.h	2010-10-22 17:00:32 -07:00
Linus Torvalds	5704e44d28	Merge branch 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl * 'config' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/bkl: BKL: introduce CONFIG_BKL. dabusb: remove the BKL sunrpc: remove the big kernel lock init/main.c: remove BKL notations blktrace: remove the big kernel lock rtmutex-tester: make it build without BKL dvb-core: kill the big kernel lock dvb/bt8xx: kill the big kernel lock tlclk: remove big kernel lock fix rawctl compat ioctls breakage on amd64 and itanic uml: kill big kernel lock parisc: remove big kernel lock cris: autoconvert trivial BKL users alpha: kill big kernel lock isapnp: BKL removal s390/block: kill the big kernel lock hpet: kill BKL, add compat_ioctl	2010-10-22 10:43:11 -07:00
Andi Kleen	e52eec13cd	SYSFS: Allow boot time switching between deprecated and modern sysfs layout I have some systems which need legacy sysfs due to old tools that are making assumptions that a directory can never be a symlink to another directory, and it's a big hazzle to compile separate kernels for them. This patch turns CONFIG_SYSFS_DEPRECATED into a run time option that can be switched on/off the kernel command line. This way the same binary can be used in both cases with just a option on the command line. The old CONFIG_SYSFS_DEPRECATED_V2 option is still there to set the default. I kept the weird name to not break existing config files. Also the compat code can be still completely disabled by undefining CONFIG_SYSFS_DEPRECATED_SWITCH -- just the optimizer takes care of this now instead of lots of ifdefs. This makes the code look nicer. v2: This is an updated version on top of Kay's patch to only handle the block devices. I tested it on my old systems and that seems to work. Cc: axboe@kernel.dk Signed-off-by: Andi Kleen <ak@linux.intel.com> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-10-22 10:16:43 -07:00
Kay Sievers	39aba963d9	driver core: remove CONFIG_SYSFS_DEPRECATED_V2 but keep it for block devices This patch removes the old CONFIG_SYSFS_DEPRECATED_V2 config option, but it keeps the logic around to handle block devices in the old manner as some people like to run new kernel versions on old (pre 2007/2008) distros. Signed-off-by: Kay Sievers <kay.sievers@vrfy.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: "James E.J. Bottomley" <James.Bottomley@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: Randy Dunlap <randy.dunlap@oracle.com> Cc: Tejun Heo <tj@kernel.org> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jaroslav Kysela <perex@perex.cz> Cc: Takashi Iwai <tiwai@suse.de> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>	2010-10-22 10:16:43 -07:00
Linus Torvalds	4a60cfa945	Merge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (96 commits) apic, x86: Use BIOS settings for IBS and MCE threshold interrupt LVT offsets apic, x86: Check if EILVT APIC registers are available (AMD only) x86: ioapic: Call free_irte only if interrupt remapping enabled arm: Use ARCH_IRQ_INIT_FLAGS genirq, ARM: Fix boot on ARM platforms genirq: Fix CONFIG_GENIRQ_NO_DEPRECATED=y build x86: Switch sparse_irq allocations to GFP_KERNEL genirq: Switch sparse_irq allocator to GFP_KERNEL genirq: Make sparse_lock a mutex x86: lguest: Use new irq allocator genirq: Remove the now unused sparse irq leftovers genirq: Sanitize dynamic irq handling genirq: Remove arch_init_chip_data() x86: xen: Sanitise sparse_irq handling x86: Use sane enumeration x86: uv: Clean up the direct access to irq_desc x86: Make io_apic.c local functions static genirq: Remove irq_2_iommu x86: Speed up the irq_remapped check in hot pathes intr_remap: Simplify the code further ... Fix up trivial conflicts in arch/x86/Kconfig	2010-10-21 14:11:46 -07:00
Linus Torvalds	5d70f79b5e	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (163 commits) tracing: Fix compile issue for trace_sched_wakeup.c [S390] hardirq: remove pointless header file includes [IA64] Move local_softirq_pending() definition perf, powerpc: Fix power_pmu_event_init to not use event->ctx ftrace: Remove recursion between recordmcount and scripts/mod/empty jump_label: Add COND_STMT(), reducer wrappery perf: Optimize sw events perf: Use jump_labels to optimize the scheduler hooks jump_label: Add atomic_t interface jump_label: Use more consistent naming perf, hw_breakpoint: Fix crash in hw_breakpoint creation perf: Find task before event alloc perf: Fix task refcount bugs perf: Fix group moving irq_work: Add generic hardirq context callbacks perf_events: Fix transaction recovery in group_sched_in() perf_events: Fix bogus AMD64 generic TLB events perf_events: Fix bogus context time tracking tracing: Remove parent recording in latency tracer graph options tracing: Use one prologue for the preempt irqs off tracer function tracers ...	2010-10-21 12:54:49 -07:00
Arnd Bergmann	6de5bd128d	BKL: introduce CONFIG_BKL. With all the patches we have queued in the BKL removal tree, only a few dozen modules are left that actually rely on the BKL, and even there are lots of low-hanging fruit. We need to decide what to do about them, this patch illustrates one of the options: Every user of the BKL is marked as 'depends on BKL' in Kconfig, and the CONFIG_BKL becomes a user-visible option. If it gets disabled, no BKL using module can be built any more and the BKL code itself is compiled out. The one exception is file locking, which is practically always enabled and does a 'select BKL' instead. This effectively forces CONFIG_BKL to be enabled until we have solved the fs/lockd mess and can apply the patch that removes the BKL from fs/locks.c. Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2010-10-21 15:44:13 +02:00
Namhyung Kim	1fa4f3b57c	init/main.c: remove BKL notations According to commit `5e3d20a68f` (init: Remove the BKL from startup code) these sparse notations should be removed also. Signed-off-by: Namhyung Kim <namhyung@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2010-10-19 11:29:58 +02:00
Peter Zijlstra	e360adbe29	irq_work: Add generic hardirq context callbacks Provide a mechanism that allows running code in IRQ context. It is most useful for NMI code that needs to interact with the rest of the system -- like wakeup a task to drain buffers. Perf currently has such a mechanism, so extract that and provide it as a generic feature, independent of perf so that others may also benefit. The IRQ context callback is generated through self-IPIs where possible, or on architectures like powerpc the decrementer (the built-in timer facility) is set to generate an interrupt immediately. Architectures that don't have anything like this get to do with a callback from the timer tick. These architectures can call irq_work_run() at the tail of any IRQ handlers that might enqueue such work (like the perf IRQ handler) to avoid undue latencies in processing the work. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Kyle McMartin <kyle@mcmartin.ca> Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [ various fixes ] Signed-off-by: Huang Ying <ying.huang@intel.com> LKML-Reference: <1287036094.7768.291.camel@yhuang-dev> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-10-18 19:58:50 +02:00
Thomas Gleixner	154cd387cd	genirq: Remove early_init_irq_lock_class() early_init_irq_lock_class() is called way before anything touches the irq descriptors. In case of SPARSE_IRQ=y this is a NOP operation because the radix tree is empty at this point. For the SPARSE_IRQ=n case it's sufficient to set the lock class in early_init_irq(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Ingo Molnar <mingo@elte.hu>	2010-10-12 16:39:06 +02:00
Thomas Gleixner	d9817ebeee	genirq: Provide Kconfig The generic irq Kconfig options are copied around all archs. Provide a generic Kconfig file which can be included. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Peter Zijlstra <peterz@infradead.org> LKML-Reference: <20100927121843.217333624@linutronix.de> Reviewed-by: H. Peter Anvin <hpa@zytor.com> Reviewed-by: Ingo Molnar <mingo@elte.hu>	2010-10-04 11:01:05 +02:00
Hendrik Brueckner	ffe8018c34	initramfs: fix initramfs size calculation The size of a built-in initramfs is calculated in init/initramfs.c by "__initramfs_end - __initramfs_start". Those symbols are defined in the linker script include/asm-generic/vmlinux.lds.h: #define INIT_RAM_FS \ . = ALIGN(PAGE_SIZE); \ VMLINUX_SYMBOL(__initramfs_start) = .; \ *(.init.ramfs) \ VMLINUX_SYMBOL(__initramfs_end) = .; If the initramfs file has an odd number of bytes, the "__initramfs_end" symbol points to an odd address, for example, the symbols in the System.map might look like: 0000000000572000 T __initramfs_start 00000000005bcd05 T __initramfs_end <-- odd address At least on s390 this causes a problem: Certain s390 instructions, especially instructions for loading addresses (larl) or branch addresses must be on even addresses. The compiler loads the symbol addresses with the "larl" instruction. This instruction sets the last bit to 0 and, therefore, for odd size files, the calculated size is one byte less than it should be: 0000000000540a9c <populate_rootfs>: 540a9c: eb cf f0 78 00 24 stmg %r12,%r15,120(%r15), 540aa2: c0 10 00 01 8a af larl %r1,572000 <__initramfs_start> 540aa8: c0 c0 00 03 e1 2e larl %r12,5bcd04 <initramfs_end> (Instead of 5bcd05) ... 540abe: 1b c1 sr %r12,%r1 To fix the problem, this patch introduces the global variable __initramfs_size, which is calculated in the "usr/initramfs_data.S" file. The populate_rootfs() function can then use the start marker of the .init.ramfs section and the value of __initramfs_size for loading the initramfs. Because the start marker and size is sufficient, the __initramfs_end symbol is no longer needed and is removed. Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Reviewed-by: WANG Cong <xiyou.wangcong@gmail.com> Acked-by: Michal Marek <mmarek@suse.cz> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Michal Marek <mmarek@suse.cz>	2010-09-29 16:28:59 +02:00
Chuck Lever	56463e50d1	NFS: Use super.c for NFSROOT mount option parsing Replace duplicate code in NFSROOT for mounting an NFS server on '/' with logic that uses the existing mainline text-based logic in the NFS client. Add documenting comments where appropriate. Note that this means NFSROOT mounts now use the same default settings as v2/v3 mounts done via mount(2) from user space. vers=3,tcp,rsize=<negotiated default>,wsize=<negotiated default> As before, however, no version/protocol negotiation with the server is done. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>	2010-09-17 10:54:37 -04:00
Jens Axboe	6d0aed7a38	do_mounts: only enable PARTUUID for CONFIG_BLOCK When CONFIG_BLOCK is not enabled: init/do_mounts.c:71: error: implicit declaration of function 'dev_to_part' init/do_mounts.c:71: warning: initialization makes pointer from integer without a cast init/do_mounts.c:73: error: dereferencing pointer to incomplete type init/do_mounts.c:76: error: dereferencing pointer to incomplete type init/do_mounts.c:76: error: dereferencing pointer to incomplete type init/do_mounts.c:102: error: implicit declaration of function 'part_pack_uuid' init/do_mounts.c:104: error: 'block_class' undeclared (first use in this function) Reported-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-09-17 10:00:46 +02:00
Vivek Goyal	e43473b7f2	blkio: Core implementation of throttle policy o Actual implementation of throttling policy in block layer. Currently it implements READ and WRITE bytes per second throttling logic. IOPS throttling comes in later patches. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-09-16 08:42:52 +02:00
Jens Axboe	38b6f45a97	core: match_dev_by_uuid() should not be marked __init It is also called outside the scope of init functions. Stephen reports: WARNING: init/mounts.o(.text+0x21a): Section mismatch in reference from the function name_to_dev_t() to the function .init.text:match_dev_by_uuid() The function name_to_dev_t() references the function __init match_dev_by_uuid(). This is often because name_to_dev_t lacks a __init annotation or the annotation of match_dev_by_uuid is wrong. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-09-16 08:33:54 +02:00
Will Drewry	b5af921ec0	init: add support for root devices specified by partition UUID This is the third patch in a series which adds support for storing partition metadata, optionally, off of the hd_struct. One major use for that data is being able to resolve partition by other identities than just the index on a block device. Device enumeration varies by platform and there's a benefit to being able to use something like EFI GPT's GUIDs to determine the correct block device and partition to mount as the root. This change adds that support to root= by adding support for the following syntax: root=PARTUUID=hex-uuid Signed-off-by: Will Drewry <wad@chromium.org> Signed-off-by: Jens Axboe <jaxboe@fusionio.com>	2010-09-15 16:14:03 +02:00
Stephan Sperber	681b3049dd	Kconfig: delete duplicate word Deleted a word which apeared twice. Signed-off-by: Stephan Sperber <sperberstephan@googlemail.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>	2010-08-23 15:35:15 +02:00
Paul E. McKenney	a57eb940d1	rcu: Add a TINY_PREEMPT_RCU Implement a small-memory-footprint uniprocessor-only implementation of preemptible RCU. This implementation uses but a single blocked-tasks list rather than the combinatorial number used per leaf rcu_node by TREE_PREEMPT_RCU, which reduces memory consumption and greatly simplifies processing. This version also takes advantage of uniprocessor execution to accelerate grace periods in the case where there are no readers. The general design is otherwise broadly similar to that of TREE_PREEMPT_RCU. This implementation is a step towards having RCU implementation driven off of the SMP and PREEMPT kernel configuration variables, which can happen once this implementation has accumulated sufficient experience. Removed ACCESS_ONCE() from __rcu_read_unlock() and added barrier() as suggested by Steve Rostedt in order to avoid the compiler-reordering issue noted by Mathieu Desnoyers (http://lkml.org/lkml/2010/8/16/183). As can be seen below, CONFIG_TINY_PREEMPT_RCU represents almost 5Kbyte savings compared to CONFIG_TREE_PREEMPT_RCU. Of course, for non-real-time workloads, CONFIG_TINY_RCU is even better. CONFIG_TREE_PREEMPT_RCU text data bss dec filename 13 0 0 13 kernel/rcupdate.o 6170 825 28 7023 kernel/rcutree.o ---- 7026 Total CONFIG_TINY_PREEMPT_RCU text data bss dec filename 13 0 0 13 kernel/rcupdate.o 2081 81 8 2170 kernel/rcutiny.o ---- 2183 Total CONFIG_TINY_RCU (non-preemptible) text data bss dec filename 13 0 0 13 kernel/rcupdate.o 719 25 0 744 kernel/rcutiny.o --- 757 Total Requested-by: Loïc Minier <loic.minier@canonical.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2010-08-20 08:55:00 -07:00
Paul E. McKenney	4d87ffadbb	rcu: Fix RCU_FANOUT help message Commit `cf244dc01b` added a fourth level to the TREE_RCU hierarchy, but the RCU_FANOUT help message still said "cube root". This commit fixes this to "fourth root" and also emphasizes that production systems are well-served by the default. (Stress-testing RCU itself uses small RCU_FANOUT values in order to test large-system code paths on small(er) systems.) Located-by: John Kacur <jkacur@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2010-08-19 17:18:04 -07:00
Paul E. McKenney	687d7a960a	rcu: restrict TREE_RCU to SMP builds with !PREEMPT Because both TINY_RCU and TREE_PREEMPT_RCU have been in mainline for several releases, it is time to restrict the use of TREE_RCU to SMP non-preemptible systems. This reduces testing/validation effort. This commit is a first step towards driving the selection of RCU implementation directly off of the SMP and PREEMPT configuration parameters. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>	2010-08-19 17:18:04 -07:00
David Howells	d7627467b7	Make do_execve() take a const filename pointer Make do_execve() take a const filename pointer so that kernel_execve() compiles correctly on ARM: arch/arm/kernel/sys_arm.c:88: warning: passing argument 1 of 'do_execve' discards qualifiers from pointer target type This also requires the argv and envp arguments to be consted twice, once for the pointer array and once for the strings the array points to. This is because do_execve() passes a pointer to the filename (now const) to copy_strings_kernel(). A simpler alternative would be to cast the filename pointer in do_execve() when it's passed to copy_strings_kernel(). do_execve() may not change any of the strings it is passed as part of the argv or envp lists as they are some of them in .rodata, so marking these strings as const should be fine. Further kernel_execve() and sys_execve() need to be changed to match. This has been test built on x86_64, frv, arm and mips. Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Ralf Baechle <ralf@linux-mips.org> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-17 18:07:43 -07:00
Linus Torvalds	26df0766a7	Merge branch 'params' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus * 'params' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-for-linus: (22 commits) param: don't deref arg in __same_type() checks param: update drivers/acpi/debug.c to new scheme param: use module_param in drivers/message/fusion/mptbase.c ide: use module_param_named rather than module_param_call param: update drivers/char/ipmi/ipmi_watchdog.c to new scheme param: lock if_sdio's lbs_helper_name and lbs_fw_name against sysfs changes. param: lock myri10ge_fw_name against sysfs changes. param: simple locking for sysfs-writable charp parameters param: remove unnecessary writable charp param: add kerneldoc to moduleparam.h param: locking for kernel parameters param: make param sections const. param: use free hook for charp (fix leak of charp parameters) param: add a free hook to kernel_param_ops. param: silence .init.text references from param ops Add param ops struct for hvc_iucv driver. nfs: update for module_param_named API change AppArmor: update for module_param_named API change param: use ops in struct kernel_param, rather than get and set fns directly param: move the EXPORT_SYMBOL to after the definitions. ...	2010-08-12 10:01:59 -07:00
KAMEZAWA Hiroyuki	65e0e81166	memcg: remove experimental from swap account config It's 11 months since we changed swap_map[] to indicates SWAP_HAS_CACHE. Since that, memcg's swap accounting has been very stable and it seems it can be maintained. So, I'd like to remove EXPERIMENTAL from the config. Acked-by: Balbir Singh <balbir@linux.vnet.ibm.com> Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-11 08:59:18 -07:00
Rusty Russell	914dcaa84c	param: make param sections const. Since this section can be read-only (they're in .rodata), they should always have been const. Minor flow-through various functions. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Tested-by: Phil Carmody <ext-phil.2.carmody@nokia.com>	2010-08-11 23:04:19 +09:30
Linus Torvalds	8c8946f509	Merge branch 'for-linus' of git://git.infradead.org/users/eparis/notify * 'for-linus' of git://git.infradead.org/users/eparis/notify: (132 commits) fanotify: use both marks when possible fsnotify: pass both the vfsmount mark and inode mark fsnotify: walk the inode and vfsmount lists simultaneously fsnotify: rework ignored mark flushing fsnotify: remove global fsnotify groups lists fsnotify: remove group->mask fsnotify: remove the global masks fsnotify: cleanup should_send_event fanotify: use the mark in handler functions audit: use the mark in handler functions dnotify: use the mark in handler functions inotify: use the mark in handler functions fsnotify: send fsnotify_mark to groups in event handling functions fsnotify: Exchange list heads instead of moving elements fsnotify: srcu to protect read side of inode and vfsmount locks fsnotify: use an explicit flag to indicate fsnotify_destroy_mark has been called fsnotify: use _rcu functions for mark list traversal fsnotify: place marks on object in order of group memory address vfs/fsnotify: fsnotify_close can delay the final work in fput fsnotify: store struct file not struct path ... Fix up trivial delete/modify conflict in fs/notify/inotify/inotify.c.	2010-08-10 11:39:13 -07:00
Kevin Winchester	e446127134	init/main.c: mark do_one_initcall* as __init_or_module Andrew Morton suggested that the do_one_initcall and do_one_initcall_debug functions can be marked __init_or_module such that they can be discarded for the CONFIG_MODULES=N case. Signed-off-by: Kevin Winchester <kjwinchester@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-09 20:45:06 -07:00
Kevin Winchester	22c5c03b42	init/main.c: fix warning: 'calltime.tv64' may be used uninitialized Using: gcc (GCC) 4.5.0 20100610 (prerelease) The following warning appears: init/main.c: In function `do_one_initcall': init/main.c:730:10: warning: `calltime.tv64' may be used uninitialized in this function This warning is actually correct, as the global initcall_debug could arguably be changed by the initcall. Correct this warning by extracting a new function, do_one_initcall_debug, that performs the initcall for the debug case. Signed-off-by: Kevin Winchester <kjwinchester@gmail.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-08-09 20:45:06 -07:00
Linus Torvalds	78417334b5	Merge branch 'bkl/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing * 'bkl/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing: do_coredump: Do not take BKL init: Remove the BKL from startup code	2010-08-07 17:06:54 -07:00
Linus Torvalds	3b7433b8a8	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (55 commits) workqueue: mark init_workqueues() as early_initcall() workqueue: explain for_each_cwq_cpu() iterators fscache: fix build on !CONFIG_SYSCTL slow-work: kill it gfs2: use workqueue instead of slow-work drm: use workqueue instead of slow-work cifs: use workqueue instead of slow-work fscache: drop references to slow-work fscache: convert operation to use workqueue instead of slow-work fscache: convert object to use workqueue instead of slow-work workqueue: fix how cpu number is stored in work->data workqueue: fix mayday_mask handling on UP workqueue: fix build problem on !CONFIG_SMP workqueue: fix locking in retry path of maybe_create_worker() async: use workqueue for worker pool workqueue: remove WQ_SINGLE_CPU and use WQ_UNBOUND instead workqueue: implement unbound workqueue workqueue: prepare for WQ_UNBOUND implementation libata: take advantage of cmwq and remove concurrency limitations workqueue: fix worker management invocation without pending works ... Fixed up conflicts in fs/cifs/ as per Tejun. Other trivial conflicts in include/linux/workqueue.h, kernel/trace/Kconfig and kernel/workqueue.c	2010-08-07 12:42:58 -07:00
Linus Torvalds	4aed2fd8e3	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (162 commits) tracing/kprobes: unregister_trace_probe needs to be called under mutex perf: expose event__process function perf events: Fix mmap offset determination perf, powerpc: fsl_emb: Restore setting perf_sample_data.period perf, powerpc: Convert the FSL driver to use local64_t perf tools: Don't keep unreferenced maps when unmaps are detected perf session: Invalidate last_match when removing threads from rb_tree perf session: Free the ref_reloc_sym memory at the right place x86,mmiotrace: Add support for tracing STOS instruction perf, sched migration: Librarize task states and event headers helpers perf, sched migration: Librarize the GUI class perf, sched migration: Make the GUI class client agnostic perf, sched migration: Make it vertically scrollable perf, sched migration: Parameterize cpu height and spacing perf, sched migration: Fix key bindings perf, sched migration: Ignore unhandled task states perf, sched migration: Handle ignored migrate out events perf: New migration tool overview tracing: Drop cpparg() macro perf: Use tracepoint_synchronize_unregister() to flush any pending tracepoint call ... Fix up trivial conflicts in Makefile and drivers/cpufreq/cpufreq.c	2010-08-06 09:30:52 -07:00
Linus Torvalds	ffd386a9a8	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu: allow limited allocation before slab is online percpu: make @dyn_size always mean min dyn_size in first chunk init functions	2010-08-04 15:17:52 -07:00
Suresh Siddha	6ee0578b4d	workqueue: mark init_workqueues() as early_initcall() Mark init_workqueues() as early_initcall() and thus it will be initialized before smp bringup. init_workqueues() registers for the hotcpu notifier and thus it should cope with the processors that are brought online after the workqueues are initialized. x86 smp bringup code uses workqueues and uses a workaround for the cold boot process (as the workqueues are initialized post smp_init()). Marking init_workqueues() as early_initcall() will pave the way for cleaning up this code. Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org>	2010-08-01 13:05:29 +02:00
Eric Paris	939a67fc4c	Audit: split audit watch Kconfig Audit watch should depend on CONFIG_AUDIT_SYSCALL and should select FSNOTIFY. This splits the spagetti like mixing of audit_watch and audit_filter code so they can be configured seperately. Signed-off-by: Eric Paris <eparis@redhat.com>	2010-07-28 09:58:19 -04:00
Eric Paris	67640b602f	Audit: audit watches depend on fsnotify CONFIG_AUDIT builds audit_watches which depend on fsnotify. Make CONFIG_AUDIT select fsnotify. Reported-by: Alexander Beregalov <a.beregalov@gmail.com> Signed-off-by: Eric Paris <eparis@redhat.com>	2010-07-28 09:58:18 -04:00
Eric Paris	28a3a7eb3b	audit: reimplement audit_trees using fsnotify rather than inotify Simply switch audit_trees from using inotify to using fsnotify for it's inode pinning and disappearing act information. Signed-off-by: Eric Paris <eparis@redhat.com>	2010-07-28 09:58:17 -04:00
Tejun Heo	181a51f6e0	slow-work: kill it slow-work doesn't have any user left. Kill it. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: David Howells <dhowells@redhat.com>	2010-07-23 13:14:49 +02:00
Arnd Bergmann	5e3d20a68f	init: Remove the BKL from startup code I have shown by code review that no driver takes the BKL at init time any more, so whatever the init code was locking against is no longer there and it is now safe to remove the BKL there. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Steven Rostedt <rostedt@goodmis> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2010-07-09 15:40:32 +02:00
Ingo Molnar	08f8ba0799	Merge commit 'v2.6.35-rc4' into perf/core Merge reason: Pick up the latest perf fixes Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-07-05 08:30:58 +02:00
Linus Torvalds	123f94f22e	Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: sched: Cure nr_iowait_cpu() users init: Fix comment init, sched: Fix race between init and kthreadd	2010-07-02 09:52:58 -07:00
Peter Zijlstra	9715856922	init: Fix comment Apparently "pid-1" confuses people... Requested-by: Randy Dunlap <randy.dunlap@oracle.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: torvalds@linux-foundation.org Cc: randy.dunlap@oracle.com Cc: Ilya Loginov <isloginov@gmail.com> LKML-Reference: <1277887031.1868.82.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-06-30 10:42:44 +02:00
Thomas Gleixner	f384c954c9	Merge branch 'linus' into perf/core Reason: Further changes conflict with upstream fixes Signed-off-by: Thomas Gleixner <tglx@linutronix.de>	2010-06-28 22:33:24 +02:00
Peter Zijlstra	b433c3d454	init, sched: Fix race between init and kthreadd Ilya reported that on a very slow machine he could reliably reproduce a race between forking init and kthreadd. We first fork init so that it obtains pid-1, however since the scheduler is already fully running at this point it can preempt and run the init thread before we spawn and set kthreadd_task. The init thread can then attempt spawning kthreads without kthreadd being present which results in an OOPS. Reported-by: Ilya Loginov <isloginov@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <1277736661.3561.110.camel@laptop> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-06-28 18:21:30 +02:00
Tejun Heo	099a19d91c	percpu: allow limited allocation before slab is online This patch updates percpu allocator such that it can serve limited amount of allocation before slab comes online. This is primarily to allow slab to depend on working percpu allocator. Two parameters, PERCPU_DYNAMIC_EARLY_SIZE and SLOTS, determine how much memory space and allocation map slots are reserved. If this reserved area is exhausted, WARN_ON_ONCE() will trigger and allocation will fail till slab comes online. The following changes are made to implement early alloc. * pcpu_mem_alloc() now checks slab_is_available() * Chunks are allocated using pcpu_mem_alloc() * Init paths make sure ai->dyn_size is at least as large as PERCPU_DYNAMIC_EARLY_SIZE. * Initial alloc maps are allocated in __initdata and copied to kmalloc'd areas once slab is online. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Christoph Lameter <cl@linux-foundation.org>	2010-06-27 18:50:00 +02:00
Ingo Molnar	646b1db495	Merge commit 'v2.6.35-rc3' into perf/core Merge reason: Go from -rc1 base to -rc3 base, merge in fixes.	2010-06-18 10:53:19 +02:00
Thomas Renninger	75cbfb97a1	ACPI: Do not try to set up acpi processor stuff on cores exceeding maxcpus= Patch is against latest Linus master branch and is expected to be safe bug fix. You get: ACPI: HARDWARE addr space,NOT supported yet for each ACPI defined CPU which status is active, but exceeds maxcpus= count. As these "not booted" CPUs do not run an idle routine and echo X >/proc/acpi/processor/*/throttling did not work I couldn't find a way to really access not onlined/booted machines. Still this should get fixed and /proc/acpi/processor/X dirs of cores exceeding maxcpus should not show up. I wonder whether this could get cleaned up by truncating possible cpu mask and nr_cpu_ids to setup_max_cpus early some day (and not exporting setup_max_cpus anymore then). But this needs touching of a lot other places... Signed-off-by: Thomas Renninger <trenn@suse.de> CC: travis@sgi.com CC: linux-acpi@vger.kernel.org CC: lenb@kernel.org Signed-off-by: Len Brown <len.brown@intel.com>	2010-06-09 18:04:12 -04:00
Li Zefan	039ca4e74a	tracing: Remove kmemtrace ftrace plugin We have been resisting new ftrace plugins and removing existing ones, and kmemtrace has been superseded by kmem trace events and perf-kmem, so we remove it. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Pekka Enberg <penberg@cs.helsinki.fi> Acked-by: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro> Cc: Ingo Molnar <mingo@elte.hu> Cc: Steven Rostedt <rostedt@goodmis.org> [ remove kmemtrace from the makefile, handle slob too ] Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2010-06-09 17:31:22 +02:00
Américo Wang	30dbb20e68	tracing: Remove boot tracer The boot tracer is useless. It simply logs the initcalls but in fact these initcalls are also logged through printk while using the initcall_debug kernel parameter. Nobody seem to be using it so far. Then just remove it. Signed-off-by: WANG Cong <xiyou.wangcong@gmail.com> Cc: Chase Douglas <chase.douglas@canonical.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Li Zefan <lizf@cn.fujitsu.com> LKML-Reference: <20100526105753.GA5677@cr0.nay.redhat.com> [ remove the hooks in main.c, and the headers ] Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>	2010-06-08 23:31:28 +02:00
Linus Torvalds	1f73897861	Merge branch 'for-35' of git://repo.or.cz/linux-kbuild * 'for-35' of git://repo.or.cz/linux-kbuild: (81 commits) kbuild: Revert part of `e8d400a` to resolve a conflict kbuild: Fix checking of scm-identifier variable gconfig: add support to show hidden options that have prompts menuconfig: add support to show hidden options which have prompts gconfig: remove show_debug option gconfig: remove dbg_print_ptype() and dbg_print_stype() kconfig: fix zconfdump() kconfig: some small fixes add random binaries to .gitignore kbuild: Include gen_initramfs_list.sh and the file list in the .d file kconfig: recalc symbol value before showing search results .gitignore: ignore *.lzo files headerdep: perlcritic warning scripts/Makefile.lib: Align the output of LZO kbuild: Generate modules.builtin in make modules_install Revert "kbuild: specify absolute paths for cscope" kbuild: Do not unnecessarily regenerate modules.builtin headers_install: use local file handles headers_check: fix perl warnings export_report: fix perl warnings ...	2010-06-01 08:55:52 -07:00
Haicheng Li	1f522509c7	mem-hotplug: avoid multiple zones sharing same boot strapping boot_pageset For each new populated zone of hotadded node, need to update its pagesets with dynamically allocated per_cpu_pageset struct for all possible CPUs: 1) Detach zone->pageset from the shared boot_pageset at end of __build_all_zonelists(). 2) Use mutex to protect zone->pageset when it's still shared in onlined_pages() Otherwises, multiple zones of different nodes would share same boot strapping boot_pageset for same CPU, which will finally cause below kernel panic: ------------[ cut here ]------------ kernel BUG at mm/page_alloc.c:1239! invalid opcode: 0000 [#1] SMP ... Call Trace: [<ffffffff811300c1>] __alloc_pages_nodemask+0x131/0x7b0 [<ffffffff81162e67>] alloc_pages_current+0x87/0xd0 [<ffffffff81128407>] __page_cache_alloc+0x67/0x70 [<ffffffff811325f0>] __do_page_cache_readahead+0x120/0x260 [<ffffffff81132751>] ra_submit+0x21/0x30 [<ffffffff811329c6>] ondemand_readahead+0x166/0x2c0 [<ffffffff81132ba0>] page_cache_async_readahead+0x80/0xa0 [<ffffffff8112a0e4>] generic_file_aio_read+0x364/0x670 [<ffffffff81266cfa>] nfs_file_read+0xca/0x130 [<ffffffff8117b20a>] do_sync_read+0xfa/0x140 [<ffffffff8117bf75>] vfs_read+0xb5/0x1a0 [<ffffffff8117c151>] sys_read+0x51/0x80 [<ffffffff8103c032>] system_call_fastpath+0x16/0x1b RIP [<ffffffff8112ff13>] get_page_from_freelist+0x883/0x900 RSP <ffff88000d1e78a8> ---[ end trace 4bda28328b9990db ] [akpm@linux-foundation.org: merge fix] Signed-off-by: Haicheng Li <haicheng.li@linux.intel.com> Signed-off-by: Wu Fengguang <fengguang.wu@intel.com> Reviewed-by: Andi Kleen <andi.kleen@intel.com> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Cc: Mel Gorman <mel@csn.ul.ie> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-05-25 08:07:01 -07:00
Jens Axboe	ee9a3607fb	Merge branch 'master' into for-2.6.35 Conflicts: fs/ext3/fsync.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2010-05-21 21:27:26 +02:00
Jason Wessel	0b4b3827db	x86, kgdb, init: Add early and late debug states The kernel debugger can operate well before mm_init(), but the x86 hardware breakpoint code which uses the perf api requires that the kernel allocators are initialized. This means the kernel debug core needs to provide an optional arch specific call back to allow the initialization functions to run after the kernel has been further initialized. The kdb shell already had a similar restriction with an early initialization and late initialization. The kdb_init() was moved into the debug core's version of the late init which is called dbg_late_init(); CC: kgdb-bugreport@lists.sourceforge.net Signed-off-by: Jason Wessel <jason.wessel@windriver.com>	2010-05-20 21:04:29 -05:00
Jason Wessel	67fc4e0cb9	kdb: core for kgdb back end (2 of 2) This patch contains the hooks and instrumentation into kernel which live outside the kernel/debug directory, which the kdb core will call to run commands like lsmod, dmesg, bt etc... CC: linux-arch@vger.kernel.org Signed-off-by: Jason Wessel <jason.wessel@windriver.com> Signed-off-by: Martin Hicks <mort@sgi.com>	2010-05-20 21:04:21 -05:00
Ingo Molnar	48652ced15	Merge commit 'v2.6.34-rc6' into sched/core	2010-05-07 11:27:54 +02:00
Jens Axboe	7407cf355f	Merge branch 'master' into for-2.6.35 Conflicts: fs/block_dev.c Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2010-04-29 09:36:24 +02:00
Vivek Goyal	afc24d49c1	blk-cgroup: config options re-arrangement This patch fixes few usability and configurability issues. o All the cgroup based controller options are configurable from "Genral Setup/Control Group Support/" menu. blkio is the only exception. Hence make this option visible in above menu and make it configurable from there to bring it inline with rest of the cgroup based controllers. o Get rid of CONFIG_DEBUG_CFQ_IOSCHED. This option currently does two things. - Enable printing of cgroup paths in blktrace - Enables CONFIG_DEBUG_BLK_CGROUP, which in turn displays additional stat files in cgroup. If we are using group scheduling, blktrace data is of not really much use if cgroup information is not present. To get this data, currently one has to also enable CONFIG_DEBUG_CFQ_IOSCHED, which in turn brings the overhead of all the additional debug stat files which is not desired. Hence, this patch moves printing of cgroup paths under CONFIG_CFQ_GROUP_IOSCHED. This allows us to get rid of CONFIG_DEBUG_CFQ_IOSCHED completely. Now all the debug stat files are controlled only by CONFIG_DEBUG_BLK_CGROUP which can be enabled through config menu. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Acked-by: Divyesh Shah <dpshah@google.com> Reviewed-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2010-04-26 19:27:56 +02:00
Phillip Lougher	df37bd156d	initramfs: handle unrecognised decompressor when unpacking The unpack routine fails to handle the decompress_method() returning unrecognised decompressor (compress_name == NULL). This results in the routine looping eventually oopsing on an out of bounds memory access. Note this bug is usually hidden, only triggering on trailing junk after one or more correct compressed blocks. The case of the compressed archive being complete junk is (by accident?) caught by the if (state != Reset) check because state is initialised to Start, but not updated due to the decompressor not having been called. Obviously if the junk is trailing a correctly decompressed buffer, state == Reset from the previous call to the decompressor. Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk> Reported-by: Aaro Koskinen <aaro.koskinen@iki.fi> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-04-24 11:31:26 -07:00
Ingo Molnar	b257c14ceb	Merge branch 'linus' into sched/core Merge reason: merge the latest fixes, update to -rc4. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-04-15 09:36:16 +02:00
Li Zefan	32bd7eb5a7	sched: Remove remaining USER_SCHED code This is left over from commit `7c9414385e` ("sched: Remove USER_SCHED"") Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Acked-by: Dhaval Giani <dhaval.giani@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: David Howells <dhowells@redhat.com> LKML-Reference: <4BA9A05F.7010407@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-04-02 20:12:00 +02:00
Tejun Heo	5a0e3ad6af	include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h percpu.h is included by sched.h and module.h and thus ends up being included when building most .c files. percpu.h includes slab.h which in turn includes gfp.h making everything defined by the two files universally available and complicating inclusion dependencies. percpu.h -> slab.h dependency is about to be removed. Prepare for this change by updating users of gfp and slab facilities include those headers directly instead of assuming availability. As this conversion needs to touch large number of source files, the following script is used as the basis of conversion. http://userweb.kernel.org/~tj/misc/slabh-sweep.py The script does the followings. * Scan files for gfp and slab usages and update includes such that only the necessary includes are there. ie. if only gfp is used, gfp.h, if slab is used, slab.h. * When the script inserts a new include, it looks at the include blocks and try to put the new include such that its order conforms to its surrounding. It's put in the include block which contains core kernel includes, in the same order that the rest are ordered - alphabetical, Christmas tree, rev-Xmas-tree or at the end if there doesn't seem to be any matching order. * If the script can't find a place to put a new include (mostly because the file doesn't have fitting include block), it prints out an error message indicating which .h file needs to be added to the file. The conversion was done in the following steps. 1. The initial automatic conversion of all .c files updated slightly over 4000 files, deleting around 700 includes and adding ~480 gfp.h and ~3000 slab.h inclusions. The script emitted errors for ~400 files. 2. Each error was manually checked. Some didn't need the inclusion, some needed manual addition while adding it to implementation .h or embedding .c file was more appropriate for others. This step added inclusions to around 150 files. 3. The script was run again and the output was compared to the edits from #2 to make sure no file was left behind. 4. Several build tests were done and a couple of problems were fixed. e.g. lib/decompress_.c used malloc/free() wrappers around slab APIs requiring slab.h to be added manually. 5. The script was run on all .h files but without automatically editing them as sprinkling gfp.h and slab.h inclusions around .h files could easily lead to inclusion dependency hell. Most gfp.h inclusion directives were ignored as stuff from gfp.h was usually wildly available and often used in preprocessor macros. Each slab.h inclusion directive was examined and added manually as necessary. 6. percpu.h was updated not to include slab.h. 7. Build test were done on the following configurations and failures were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my distributed build env didn't work with gcov compiles) and a few more options had to be turned off depending on archs to make things build (like ipr on powerpc/64 which failed due to missing writeq). x86 and x86_64 UP and SMP allmodconfig and a custom test config. * powerpc and powerpc64 SMP allmodconfig * sparc and sparc64 SMP allmodconfig * ia64 SMP allmodconfig * s390 SMP allmodconfig * alpha SMP allmodconfig * um on x86_64 SMP allmodconfig 8. percpu.h modifications were reverted so that it could be applied as a separate patch and serve as bisection point. Given the fact that I had only a couple of failures from tests on step 6, I'm fairly confident about the coverage of this conversion patch. If there is a breakage, it's likely to be something in one of the arch headers which should be easily discoverable easily on most builds of the specific arch. Signed-off-by: Tejun Heo <tj@kernel.org> Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>	2010-03-30 22:02:32 +09:00
Miao Xie	5ab116c934	cpuset: fix the problem that cpuset_mem_spread_node() returns an offline node cpuset_mem_spread_node() returns an offline node, and causes an oops. This patch fixes it by initializing task->mems_allowed to node_states[N_HIGH_MEMORY], and updating task->mems_allowed when doing memory hotplug. Signed-off-by: Miao Xie <miaox@cn.fujitsu.com> Acked-by: David Rientjes <rientjes@google.com> Reported-by: Nick Piggin <npiggin@suse.de> Tested-by: Nick Piggin <npiggin@suse.de> Cc: Paul Menage <menage@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-03-24 16:31:21 -07:00
Kirill A. Shutemov	0dea116876	cgroup: implement eventfd-based generic API for notifications This patchset introduces eventfd-based API for notifications in cgroups and implements memory notifications on top of it. It uses statistics in memory controler to track memory usage. Output of time(1) on building kernel on tmpfs: Root cgroup before changes: make -j2 506.37 user 60.93s system 193% cpu 4:52.77 total Non-root cgroup before changes: make -j2 507.14 user 62.66s system 193% cpu 4:54.74 total Root cgroup after changes (0 thresholds): make -j2 507.13 user 62.20s system 193% cpu 4:53.55 total Non-root cgroup after changes (0 thresholds): make -j2 507.70 user 64.20s system 193% cpu 4:55.70 total Root cgroup after changes (1 thresholds, never crossed): make -j2 506.97 user 62.20s system 193% cpu 4:53.90 total Non-root cgroup after changes (1 thresholds, never crossed): make -j2 507.55 user 64.08s system 193% cpu 4:55.63 total This patch: Introduce the write-only file "cgroup.event_control" in every cgroup. To register new notification handler you need: - create an eventfd; - open a control file to be monitored. Callbacks register_event() and unregister_event() must be defined for the control file; - write "<event_fd> <control_fd> <args>" to cgroup.event_control. Interpretation of args is defined by control file implementation; eventfd will be woken up by control file implementation or when the cgroup is removed. To unregister notification handler just close eventfd. If you need notification functionality for a control file you have to implement callbacks register_event() and unregister_event() in the struct cftype. [kamezawa.hiroyu@jp.fujitsu.com: Kconfig fix] Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name> Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Paul Menage <menage@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Balbir Singh <balbir@linux.vnet.ibm.com> Cc: Pavel Emelyanov <xemul@openvz.org> Cc: Dan Malek <dan@embeddedalley.com> Cc: Vladislav Buzov <vbuzov@embeddedalley.com> Cc: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> Cc: Alexander Shishkin <virtuoso@slind.org> Cc: Davide Libenzi <davidel@xmailserver.org> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-03-12 15:52:37 -08:00
H Hartley Sweeten	7463e633c5	init/main.c: make setup_max_cpus static for !SMP The only in tree external users of the symbol setup_max_cpus are in arch/x86/. The files ./kernel/alternative.c, ./kernel/visws_quirks.c, and ./mm/kmemcheck/kmemcheck.c are all guarded by CONFIG_SMP being defined. For this case the symbol is an unsigned int and declared as an extern in include/linux/smp.h. When CONFIG_SMP is not defined the symbol setup_max_cpus is a constant value that is only used in init/main.c. Make the symbol static for this case. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-03-06 11:26:32 -08:00
H Hartley Sweeten	8aaed5bec2	init/initramfs.c: fix "symbol shadows an earlier one" noise The symbol 'count' is a local global variable in this file. The function clean_rootfs() should use a different symbol name to prevent "symbol shadows an earlier one" noise. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-03-06 11:26:29 -08:00
Andreas Mohr	9a85b8d604	init/main.c: improve usability in case of init binary failure - new Documentation/init.txt file describing various forms of failure trying to load the init binary after kernel bootup - extend the init/main.c init failure message to direct to Documentation/init.txt Signed-off-by: Andreas Mohr <andi@lisas.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-03-06 11:26:29 -08:00
Rafael J. Wysocki	452aa6999e	mm/pm: force GFP_NOIO during suspend/hibernation and resume There are quite a few GFP_KERNEL memory allocations made during suspend/hibernation and resume that may cause the system to hang, because the I/O operations they depend on cannot be completed due to the underlying devices being suspended. Avoid this problem by clearing the __GFP_IO and __GFP_FS bits in gfp_allowed_mask before suspend/hibernation and restoring the original values of these bits in gfp_allowed_mask durig the subsequent resume. [akpm@linux-foundation.org: fix CONFIG_PM=n linkage] Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl> Reported-by: Maxim Levitsky <maximlevitsky@gmail.com> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-03-06 11:26:26 -08:00
Linus Torvalds	0f2cc4ecd8	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6: (52 commits) init: Open /dev/console from rootfs mqueue: fix typo "failues" -> "failures" mqueue: only set error codes if they are really necessary mqueue: simplify do_open() error handling mqueue: apply mathematics distributivity on mq_bytes calculation mqueue: remove unneeded info->messages initialization mqueue: fix mq_open() file descriptor leak on user-space processes fix race in d_splice_alias() set S_DEAD on unlink() and non-directory rename() victims vfs: add NOFOLLOW flag to umount(2) get rid of ->mnt_parent in tomoyo/realpath hppfs can use existing proc_mnt, no need for do_kern_mount() in there Mirror MS_KERNMOUNT in ->mnt_flags get rid of useless vfsmount_lock use in put_mnt_ns() Take vfsmount_lock to fs/internal.h get rid of insanity with namespace roots in tomoyo take check for new events in namespace (guts of mounts_poll()) to namespace.c Don't mess with generic_permission() under ->d_lock in hpfs sanitize const/signedness for udf nilfs: sanitize const/signedness in dealing with ->d_name.name ... Fix up fairly trivial (famous last words...) conflicts in drivers/infiniband/core/uverbs_main.c and security/tomoyo/realpath.c	2010-03-04 08:15:33 -08:00
Eric W. Biederman	2bd3a997be	init: Open /dev/console from rootfs To avoid potential problems with an empty /dev open /dev/console from rootfs instead of waiting to mount our root filesystem and mounting it there. This effectively guarantees that there will be a device node, and it won't be on a filesystem that we will ever unmount, so there are no issues with leaving /dev/console open and pinning the filesystem. This is actually more effective than automatically mounting devtmpfs on /dev because it removes removes the occasionally problematic assumption that /dev/console exists from the boot code. With this patch I was able to throw busybox on my /boot partition (which has no /dev directory) and boot into userspace without problems. The only possible negative consequence I can think of is that someone out there deliberately used did not use a character device that is major 5 minor 2 for /dev/console. Does anyone know of a situation in which that could make sense? Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-03-03 14:56:07 -05:00
Linus Torvalds	fb7b096d94	Merge branch 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-apic-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (25 commits) x86: Fix out of order of gsi x86: apic: Fix mismerge, add arch_probe_nr_irqs() again x86, irq: Keep chip_data in create_irq_nr and destroy_irq xen: Remove unnecessary arch specific xen irq functions. smp: Use nr_cpus= to set nr_cpu_ids early x86, irq: Remove arch_probe_nr_irqs sparseirq: Use radix_tree instead of ptrs array sparseirq: Change irq_desc_ptrs to static init: Move radix_tree_init() early irq: Remove unnecessary bootmem code x86: Add iMac9,1 to pci_reboot_dmi_table x86: Convert i8259_lock to raw_spinlock x86: Convert nmi_lock to raw_spinlock x86: Convert ioapic_lock and vector_lock to raw_spinlock x86: Avoid race condition in pci_enable_msix() x86: Fix SCI on IOAPIC != 0 x86, ia32_aout: do not kill argument mapping x86, irq: Move __setup_vector_irq() before the first irq enable in cpu online path x86, irq: Update the vector domain for legacy irqs handled by io-apic x86, irq: Don't block IRQ0_VECTOR..IRQ15_VECTOR's on all cpu's ...	2010-03-03 08:15:37 -08:00
Linus Torvalds	f66ffdedbf	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (25 commits) sched: Fix SCHED_MC regression caused by change in sched cpu_power sched: Don't use possibly stale sched_class kthread, sched: Remove reference to kthread_create_on_cpu sched: cpuacct: Use bigger percpu counter batch values for stats counters percpu_counter: Make __percpu_counter_add an inline function on UP sched: Remove member rt_se from struct rt_rq sched: Change usage of rt_rq->rt_se to rt_rq->tg->rt_se[cpu] sched: Remove unused update_shares_locked() sched: Use for_each_bit sched: Queue a deboosted task to the head of the RT prio queue sched: Implement head queueing for sched_rt sched: Extend enqueue_task to allow head queueing sched: Remove USER_SCHED sched: Fix the place where group powers are updated sched: Assume *balance is valid sched: Remove load_balance_newidle() sched: Unify load_balance{,_newidle}() sched: Add a lock break for PREEMPT=y sched: Remove from fwd decls sched: Remove rq_iterator from move_one_task ... Fix up trivial conflicts in kernel/sched.c	2010-02-28 10:31:01 -08:00
Linus Torvalds	6556a67435	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (172 commits) perf_event, amd: Fix spinlock initialization perf_event: Fix preempt warning in perf_clock() perf tools: Flush maps on COMM events perf_events, x86: Split PMU definitions into separate files perf annotate: Handle samples not at objdump output addr boundaries perf_events, x86: Remove superflous MSR writes perf_events: Simplify code by removing cpu argument to hw_perf_group_sched_in() perf_events, x86: AMD event scheduling perf_events: Add new start/stop PMU callbacks perf_events: Report the MMAP pgoff value in bytes perf annotate: Defer allocating sym_priv->hist array perf symbols: Improve debugging information about symtab origins perf top: Use a macro instead of a constant variable perf symbols: Check the right return variable perf/scripts: Tag syscall_name helper as not yet available perf/scripts: Add perf-trace-python Documentation perf/scripts: Remove unnecessary PyTuple resizes perf/scripts: Add syscall tracing scripts perf/scripts: Add Python scripting engine perf/scripts: Remove check-perf-trace from listed scripts ... Fix trivial conflict in tools/perf/util/probe-event.c	2010-02-28 10:20:25 -08:00
Linus Torvalds	d25e8dbdab	Merge branch 'oprofile-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'oprofile-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: oprofile/x86: fix msr access to reserved counters oprofile/x86: use kzalloc() instead of kmalloc() oprofile/x86: fix perfctr nmi reservation for mulitplexing oprofile/x86: add comment to counter-in-use warning oprofile/x86: warn user if a counter is already active oprofile/x86: implement randomization for IBS periodic op counter oprofile/x86: implement lsfr pseudo-random number generator for IBS oprofile/x86: implement IBS cpuid feature detection oprofile/x86: remove node check in AMD IBS initialization oprofile/x86: remove OPROFILE_IBS config option oprofile: remove EXPERIMENTAL from the config option description oprofile: remove tracing build dependency	2010-02-28 10:15:31 -08:00
Linus Torvalds	642c4c75a7	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (44 commits) rcu: Fix accelerated GPs for last non-dynticked CPU rcu: Make non-RCU_PROVE_LOCKING rcu_read_lock_sched_held() understand boot rcu: Fix accelerated grace periods for last non-dynticked CPU rcu: Export rcu_scheduler_active rcu: Make rcu_read_lock_sched_held() take boot time into account rcu: Make lockdep_rcu_dereference() message less alarmist sched, cgroups: Fix module export rcu: Add RCU_CPU_STALL_VERBOSE to dump detailed per-task information rcu: Fix rcutorture mod_timer argument to delay one jiffy rcu: Fix deadlock in TREE_PREEMPT_RCU CPU stall detection rcu: Convert to raw_spinlocks rcu: Stop overflowing signed integers rcu: Use canonical URL for Mathieu's dissertation rcu: Accelerate grace period if last non-dynticked CPU rcu: Fix citation of Mathieu's dissertation rcu: Documentation update for CONFIG_PROVE_RCU security: Apply lockdep-based checking to rcu_dereference() uses idr: Apply lockdep-based diagnostics to rcu_dereference() uses radix-tree: Disable RCU lockdep checking in radix tree vfs: Abstract rcu_dereference_check for files-fdtable use ...	2010-02-28 10:13:16 -08:00
Linus Torvalds	37d4008484	Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (31 commits) crypto: aes_generic - Fix checkpatch errors crypto: fcrypt - Fix checkpatch errors crypto: ecb - Fix checkpatch errors crypto: des_generic - Fix checkpatch errors crypto: deflate - Fix checkpatch errors crypto: crypto_null - Fix checkpatch errors crypto: cipher - Fix checkpatch errors crypto: crc32 - Fix checkpatch errors crypto: compress - Fix checkpatch errors crypto: cast6 - Fix checkpatch errors crypto: cast5 - Fix checkpatch errors crypto: camellia - Fix checkpatch errors crypto: authenc - Fix checkpatch errors crypto: api - Fix checkpatch errors crypto: anubis - Fix checkpatch errors crypto: algapi - Fix checkpatch errors crypto: blowfish - Fix checkpatch errors crypto: aead - Fix checkpatch errors crypto: ablkcipher - Fix checkpatch errors crypto: pcrypt - call the complete function on error ...	2010-02-26 16:50:02 -08:00
Robert Richter	b309a294e5	oprofile: remove EXPERIMENTAL from the config option description OProfile is already used for a long time and no longer experimental. Signed-off-by: Robert Richter <robert.richter@amd.com>	2010-02-26 15:13:54 +01:00
Paul E. McKenney	8bd93a2c5d	rcu: Accelerate grace period if last non-dynticked CPU Currently, rcu_needs_cpu() simply checks whether the current CPU has an outstanding RCU callback, which means that the last CPU to go into dyntick-idle mode might wait a few ticks for the relevant grace periods to complete. However, if all the other CPUs are in dyntick-idle mode, and if this CPU is in a quiescent state (which it is for RCU-bh and RCU-sched any time that we are considering going into dyntick-idle mode), then the grace period is instantly complete. This patch therefore repeatedly invokes the RCU grace-period machinery in order to force any needed grace periods to complete quickly. It does so a limited number of times in order to prevent starvation by an RCU callback function that might pass itself to call_rcu(). However, if any CPU other than the current one is not in dyntick-idle mode, fall back to simply checking (with fix to bug noted by Lai Jiangshan). Also, take advantage of last grace-period forcing, the opportunity to do so noted by Steve Rostedt. And apply simplified #ifdef condition suggested by Frederic Weisbecker. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1266887105-1528-15-git-send-email-paulmck@linux.vnet.ibm.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-02-25 10:34:55 +01:00
Paul E. McKenney	d11c563dd2	sched: Use lockdep-based checking on rcu_dereference() Update the rcu_dereference() usages to take advantage of the new lockdep-based checking. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: laijs@cn.fujitsu.com Cc: dipankar@in.ibm.com Cc: mathieu.desnoyers@polymtl.ca Cc: josh@joshtriplett.org Cc: dvhltc@us.ibm.com Cc: niv@us.ibm.com Cc: peterz@infradead.org Cc: rostedt@goodmis.org Cc: Valdis.Kletnieks@vt.edu Cc: dhowells@redhat.com LKML-Reference: <1266887105-1528-6-git-send-email-paulmck@linux.vnet.ibm.com> [ -v2: fix allmodconfig missing symbol export build failure on x86 ] Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-02-25 10:34:26 +01:00
Yinghai Lu	2b633e3fac	smp: Use nr_cpus= to set nr_cpu_ids early On x86, before prefill_possible_map(), nr_cpu_ids will be NR_CPUS aka CONFIG_NR_CPUS. Add nr_cpus= to set nr_cpu_ids. so we can simulate cpus <=8 are installed on normal config. -v2: accordging to Christoph, acpi_numa_init should use nr_cpu_ids in stead of NR_CPUS. -v3: add doc in kernel-parameters.txt according to Andrew. Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <1265793639-15071-34-git-send-email-yinghai@kernel.org> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Cc: Tony Luck <tony.luck@intel.com>	2010-02-17 17:30:22 -08:00
Yinghai Lu	773e3eb7b8	init: Move radix_tree_init() early Prepare for using radix trees in early_irq_init(). Signed-off-by: Yinghai Lu <yinghai@kernel.org> LKML-Reference: <1265793639-15071-30-git-send-email-yinghai@kernel.org> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2010-02-17 17:26:33 -08:00
Ingo Molnar	6d3e0907b8	Merge branch 'sched/urgent' into sched/core Merge reason: Merge dependent fix, update to latest -rc. Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-02-08 08:55:46 +01:00
Eric Paris	54bb6552bd	ima: initialize ima before inodes can be allocated ima wants to create an inode information struct (iint) when inodes are allocated. This means that at least the part of ima which does this allocation (the allocation is filled with information later) should before any inodes are created. To accomplish this we split the ima initialization routine placing the kmem cache allocator inside a security_initcall() function. Since this makes use of radix trees we also need to make sure that is initialized before security_initcall(). Signed-off-by: Eric Paris <eparis@redhat.com> Acked-by: Mimi Zohar <zohar@linux.vnet.ibm.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2010-02-07 03:06:22 -05:00
Roland McGrath	8433646601	kconfig CROSS_COMPILE option This adds CROSS_COMPILE as a kconfig string so you can store it in .config. Then you can use plain "make" in the configured kernel build directory to do the right cross compilation without setting the command-line or environment variable every time. With this, you can set up different build directories for different kernel configurations, whether native or cross-builds, and then use the simple: make -C /build/dir M=module-source-dir idiom to build modules for any given target kernel, indicating which one by nothing but the build directory chosen. I tried a version that defaults the string with env="CROSS_COMPILE" so that in a "make oldconfig" with CROSS_COMPILE in the environment you can just hit return to store the way you're building it. But the kconfig prompt for strings doesn't give you any way to say you want an empty string instead of the default, so I punted that. Signed-off-by: Roland McGrath <roland@redhat.com> Cc: Sam Ravnborg <sam@ravnborg.org> Cc: Anibal Monsalve Salazar <anibal@debian.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Michal Marek <mmarek@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Michal Marek <mmarek@suse.cz>	2010-02-02 14:33:54 +01:00
Dhaval Giani	7c9414385e	sched: Remove USER_SCHED Remove the USER_SCHED feature. It has been scheduled to be removed in 2.6.34 as per http://marc.info/?l=linux-kernel&m=125728479022976&w=2 Signed-off-by: Dhaval Giani <dhaval.giani@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1263990378.24844.3.camel@localhost> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-01-21 13:40:18 +01:00
Ingo Molnar	61405fea92	Merge branch 'perf/urgent' into perf/core Merge reason: queue up dependent patch, update to -rc4 Signed-off-by: Ingo Molnar <mingo@elte.hu>	2010-01-13 10:08:50 +01:00
Albin Tonnerre	7dd65feb6c	lib: add support for LZO-compressed kernels This patch series adds generic support for creating and extracting LZO-compressed kernel images, as well as support for using such images on the x86 and ARM architectures, and support for creating and using LZO-compressed initrd and initramfs images. Russell King said: : Testing on a Cortex A9 model: : - lzo decompressor is 65% of the time gzip takes to decompress a kernel : - lzo kernel is 9% larger than a gzip kernel : : which I'm happy to say confirms your figures when comparing the two. : : However, when comparing your new gzip code to the old gzip code: : - new is 99% of the size of the old code : - new takes 42% of the time to decompress than the old code : : What this means is that for a proper comparison, the results get even better: : - lzo is 7.5% larger than the old gzip'd kernel image : - lzo takes 28% of the time that the old gzip code took : : So the expense seems definitely worth the effort. The only reason I : can think of ever using gzip would be if you needed the additional : compression (eg, because you have limited flash to store the image.) : : I would argue that the default for ARM should therefore be LZO. This patch: The lzo compressor is worse than gzip at compression, but faster at extraction. Here are some figures for an ARM board I'm working on: Uncompressed size: 3.24Mo gzip 1.61Mo 0.72s lzo 1.75Mo 0.48s So for a compression ratio that is still relatively close to gzip, it's much faster to extract, at least in that case. This part contains: - Makefile routine to support lzo compression - Fixes to the existing lzo compressor so that it can be used in compressed kernels - wrapper around the existing lzo1x_decompress, as it only extracts one block at a time, while we need to extract a whole file here - config dialog for kernel compression [akpm@linux-foundation.org: coding-style fixes] [akpm@linux-foundation.org: cleanup] Signed-off-by: Albin Tonnerre <albin.tonnerre@free-electrons.com> Tested-by: Wu Zhangjin <wuzhangjin@gmail.com> Acked-by: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Tested-by: Russell King <rmk@arm.linux.org.uk> Acked-by: Russell King <rmk@arm.linux.org.uk> Cc: Ralf Baechle <ralf@linux-mips.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2010-01-11 09:34:04 -08:00
Steffen Klassert	16295bec63	padata: Generic parallelization/serialization interface This patch introduces an interface to process data objects in parallel. The parallelized objects return after serialization in the same order as they were before the parallelization. Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2010-01-06 19:47:10 +11:00
Li Zefan	07b139c8c8	perf events: Remove CONFIG_EVENT_PROFILE Quoted from Ingo: \| This reminds me - i think we should eliminate CONFIG_EVENT_PROFILE - \| it's an unnecessary Kconfig complication. If both PERF_EVENTS and \| EVENT_TRACING is enabled we should expose generic tracepoints. \| \| Nor is it limited to event 'profiling', so it has become a misnomer as \| well. Signed-off-by: Li Zefan <lizf@cn.fujitsu.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Paul Mackerras <paulus@samba.org> LKML-Reference: <4B2F1557.2050705@cn.fujitsu.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-28 10:33:06 +01:00
Linus Torvalds	3981e15286	Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, irq: Allow 0xff for /proc/irq/[n]/smp_affinity on an 8-cpu system Makefile: Unexport LC_ALL instead of clearing it x86: Fix objdump version check in arch/x86/tools/chkobjdump.awk x86: Reenable TSC sync check at boot, even with NONSTOP_TSC x86: Don't use POSIX character classes in gen-insn-attr-x86.awk Makefile: set LC_CTYPE, LC_COLLATE, LC_NUMERIC to C x86: Increase MAX_EARLY_RES; insufficient on 32-bit NUMA x86: Fix checking of SRAT when node 0 ram is not from 0 x86, cpuid: Add "volatile" to asm in native_cpuid() x86, msr: msrs_alloc/free for CONFIG_SMP=n x86, amd: Get multi-node CPU info from NodeId MSR instead of PCI config space x86: Add IA32_TSC_AUX MSR and use it x86, msr/cpuid: Register enough minors for the MSR and CPUID drivers initramfs: add missing decompressor error check bzip2: Add missing checks for malloc returning NULL bzip2/lzma/gzip: pre-boot malloc doesn't return NULL on failure	2009-12-19 09:48:14 -08:00
Linus Torvalds	aac3d39693	Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (25 commits) sched: Fix broken assertion sched: Assert task state bits at build time sched: Update task_state_arraypwith new states sched: Add missing state chars to TASK_STATE_TO_CHAR_STR sched: Move TASK_STATE_TO_CHAR_STR near the TASK_state bits sched: Teach might_sleep() about preemptible RCU sched: Make warning less noisy sched: Simplify set_task_cpu() sched: Remove the cfs_rq dependency from set_task_cpu() sched: Add pre and post wakeup hooks sched: Move kthread_bind() back to kthread.c sched: Fix select_task_rq() vs hotplug issues sched: Fix sched_exec() balancing sched: Ensure set_task_cpu() is never called on blocked tasks sched: Use TASK_WAKING for fork wakups sched: Select_task_rq_fair() must honour SD_LOAD_BALANCE sched: Fix task_hot() test order sched: Fix set_cpu_active() in cpu_down() sched: Mark boot-cpu active before smp_init() sched: Fix cpu_clock() in NMIs, on !CONFIG_HAVE_UNSTABLE_SCHED_CLOCK ...	2009-12-19 09:47:49 -08:00
Linus Torvalds	5a865c0606	Merge branch 'for-33' of git://repo.or.cz/linux-kbuild * 'for-33' of git://repo.or.cz/linux-kbuild: (29 commits) net: fix for utsrelease.h moving to generated gen_init_cpio: fixed fwrite warning kbuild: fix make clean after mismerge kbuild: generate modules.builtin genksyms: properly consider EXPORT_UNUSED_SYMBOL{,_GPL}() score: add asm/asm-offsets.h wrapper unifdef: update to upstream revision 1.190 kbuild: specify absolute paths for cscope kbuild: create include/generated in silentoldconfig scripts/package: deb-pkg: use fakeroot if available scripts/package: add KBUILD_PKG_ROOTCMD variable scripts/package: tar-pkg: use tar --owner=root Kbuild: clean up marker net: add net_tstamp.h to headers_install kbuild: move utsrelease.h to include/generated kbuild: move autoconf.h to include/generated drop explicit include of autoconf.h kbuild: move compile.h to include/generated kbuild: drop include/asm kbuild: do not check for include/asm-$ARCH ... Fixed non-conflicting clean merge of modpost.c as per comments from Stephen Rothwell (modpost.c had grown an include of linux/autoconf.h that needed to be changed to generated/autoconf.h)	2009-12-17 07:23:42 -08:00
Peter Zijlstra	933b0618d8	sched: Mark boot-cpu active before smp_init() A UP machine has 1 active cpu, not having the boot-cpu in the active map when starting the scheduler confuses things. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mike Galbraith <efault@gmx.de> LKML-Reference: <20091216170517.423469527@chello.nl> Signed-off-by: Ingo Molnar <mingo@elte.hu>	2009-12-16 19:01:53 +01:00
Phillip Lougher	54291362d2	initramfs: add missing decompressor error check The decompressors return error by calling a supplied error function, and/or by returning an error return value. The initramfs code, however, fails to check the exit code returned by the decompressor, and only checks the error status set by calling the error function. This patch adds a return code check and calls the error function. Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk> LKML-Reference: <4b26b1ef.0+ZWxT6886olqcSc%phillip@lougher.demon.co.uk> Signed-off-by: H. Peter Anvin <hpa@zytor.com>	2009-12-15 14:04:24 -08:00
H Hartley Sweeten	196a15b4ee	init/main.c: fix symbol shadows noise The symbol 'call' is a static symbol used for initcall_debug. This same symbol name is used locally by a couple functions and produces the following sparse warnings: warning: symbol 'call' shadows an earlier one Fix this noise by renaming the local symbols. Signed-off-by: H Hartley Sweeten <hsweeten@visionengravers.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-15 08:53:26 -08:00
Jie Zhang	ea63763959	nommu: fix malloc performance by adding uninitialized flag The NOMMU code currently clears all anonymous mmapped memory. While this is what we want in the default case, all memory allocation from userspace under NOMMU has to go through this interface, including malloc() which is allowed to return uninitialized memory. This can easily be a significant performance penalty. So for constrained embedded systems were security is irrelevant, allow people to avoid clearing memory unnecessarily. This also alters the ELF-FDPIC binfmt such that it obtains uninitialised memory for the brk and stack region. Signed-off-by: Jie Zhang <jie.zhang@analog.com> Signed-off-by: Robin Getz <rgetz@blackfin.uclinux.org> Signed-off-by: Mike Frysinger <vapier@gentoo.org> Signed-off-by: David Howells <dhowells@redhat.com> Acked-by: Paul Mundt <lethal@linux-sh.org> Acked-by: Greg Ungerer <gerg@snapgear.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-15 08:53:24 -08:00
Sam Ravnborg	273b281fa2	kbuild: move utsrelease.h to include/generated Fix up all users of utsrelease.h Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Michal Marek <mmarek@suse.cz>	2009-12-12 13:08:15 +01:00
Sam Ravnborg	9204595405	kbuild: move compile.h to include/generated Signed-off-by: Sam Ravnborg <sam@ravnborg.org> Signed-off-by: Michal Marek <mmarek@suse.cz>	2009-12-12 13:08:14 +01:00
Linus Torvalds	60d8ce2cd6	Merge branch 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: timers, init: Limit the number of per cpu calibration bootup messages posix-cpu-timers: optimize and document timer_create callback clockevents: Add missing include to pacify sparse x86: vmiclock: Fix printk format x86: Fix printk format due to variable type change sparc: fix printk for change of variable type clocksource/events: Fix fallout of generic code changes nohz: Allow 32-bit machines to sleep for more than 2.15 seconds nohz: Track last do_timer() cpu nohz: Prevent clocksource wrapping during idle nohz: Type cast printk argument mips: Use generic mult/shift factor calculation for clocks clocksource: Provide a generic mult/shift factor calculation clockevents: Use u32 for mult and shift factors nohz: Introduce arch_needs_cpu nohz: Reuse ktime in sub-functions of tick_check_idle. time: Remove xtime_cache time: Implement logarithmic time accumulation	2009-12-08 19:27:08 -08:00
Uwe Kleine-König	9e9868a737	sysfs: deprecated features are to help old tools not to confuse them Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Randy Dunlap <randy.dunlap@oracle.com> Cc: Greg KH <gregkh@suse.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Kay Sievers <kay.sievers@vrfy.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-08 13:23:39 -08:00
Linus Torvalds	1557d33007	Merge git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl-2.6 * git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/sysctl-2.6: (43 commits) security/tomoyo: Remove now unnecessary handling of security_sysctl. security/tomoyo: Add a special case to handle accesses through the internal proc mount. sysctl: Drop & in front of every proc_handler. sysctl: Remove CTL_NONE and CTL_UNNUMBERED sysctl: kill dead ctl_handler definitions. sysctl: Remove the last of the generic binary sysctl support sysctl net: Remove unused binary sysctl code sysctl security/tomoyo: Don't look at ctl_name sysctl arm: Remove binary sysctl support sysctl x86: Remove dead binary sysctl support sysctl sh: Remove dead binary sysctl support sysctl powerpc: Remove dead binary sysctl support sysctl ia64: Remove dead binary sysctl support sysctl s390: Remove dead sysctl binary support sysctl frv: Remove dead binary sysctl support sysctl mips/lasat: Remove dead binary sysctl support sysctl drivers: Remove dead binary sysctl support sysctl crypto: Remove dead binary sysctl support sysctl security/keys: Remove dead binary sysctl support sysctl kernel: Remove binary sysctl logic ...	2009-12-08 07:38:50 -08:00
Linus Torvalds	607781762e	Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits) rcu: Make RCU's CPU-stall detector be default rcu: Add expedited grace-period support for preemptible RCU rcu: Enable fourth level of TREE_RCU hierarchy rcu: Rename "quiet" functions rcu: Re-arrange code to reduce #ifdef pain rcu: Eliminate unneeded function wrapping rcu: Fix grace-period-stall bug on large systems with CPU hotplug rcu: Eliminate __rcu_pending() false positives rcu: Further cleanups of use of lastcomp rcu: Simplify association of forced quiescent states with grace periods rcu: Accelerate callback processing on CPUs not detecting GP end rcu: Mark init-time-only rcu_bootup_announce() as __init rcu: Simplify association of quiescent states with grace periods rcu: Rename dynticks_completed to completed_fqs rcu: Enable synchronize_sched_expedited() fastpath rcu: Remove inline from forward-referenced functions rcu: Fix note_new_gpnum() uses of ->gpnum rcu: Fix synchronization for rcu_process_gp_end() uses of ->completed counter rcu: Prepare for synchronization fixes: clean up for non-NO_HZ handling of ->completed counter rcu: Cleanup: balance rcu_irq_enter()/rcu_irq_exit() calls ...	2009-12-05 09:52:14 -08:00
Linus Torvalds	3e72b810e3	Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: mutex: Fix missing conditions to build mutex_spin_on_owner() mutex: Better control mutex adaptive spinning config locking, task_struct: Reduce size on TRACE_IRQFLAGS and 64bit locking: Use __[SPIN\|RW]_LOCK_UNLOCKED in [spin\|rw]_lock_init() locking: Remove unused prototype locking: Reduce ifdefs in kernel/spinlock.c locking: Make inlining decision Kconfig based	2009-12-05 09:49:59 -08:00
Rusty Russell	f066a4f6df	param: don't complain about unused module parameters. Jon confirms that recent modprobe will look in /proc/cmdline, so these cmdline options can still be used. See http://bugzilla.kernel.org/show_bug.cgi?id=14164 Reported-by: Adam Williamson <awilliam@redhat.com> Cc: stable@kernel.org Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2009-12-02 15:40:07 -08:00

... 7 8 9 10 11 ...

1513 Commits