linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-11-30 08:56:45 +07:00

Author	SHA1	Message	Date
Jayachandran C	55c25c2f14	MIPS: mm: Move some checks out of 'for' loop in DMA operations The check cpu_needs_post_dma_flush() in mips_dma_sync_sg_for_cpu() and the check !plat_device_is_coherent() in mips_dma_sync_sg_for_device() can be moved outside the for loop. As a side effect, this also avoids a GCC bug that caused kernel compile to fail with the error: arch/mips/mm/dma-default.c: In function 'mips_dma_sync_sg_for_cpu': arch/mips/mm/dma-default.c:316:1: internal compiler error: in add_insn_before, at emit-rtl.c:3852 This gcc failure is seen in Code Sourcery toolchains [e.g. gcc version 4.7.2 (Sourcery CodeBench Lite 2012.09-99)] after commit "MIPS: Optimize current_cpu_type() for better code." Signed-off-by: Jayachandran C <jchandra@broadcom.com> Cc: linux-mips@linux-mips.org Patchwork: https://patchwork.linux-mips.org/patch/5907/ Reviewed-by: Markos Chandras <markos.chandras@imgtec.com> Tested-by: Markos Chandras <markos.chandras@imgtec.com> Signed-off-by: Ralf Baechle <ralf@linux-mips.org>	2013-09-25 17:05:44 +02:00
Konrad Rzeszutek Wilk	15a3eac078	xen/spinlock: Document the xen_nopvspin parameter. This parameter disables the Xen PV optimizations in the ticketlock slowpath. Useful for diagnosing issues and comparing benchmarks in over-commit CPU scenarios. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-09-25 10:07:34 -04:00
David Vrabel	0160676bba	xen/p2m: check MFN is in range before using the m2p table On hosts with more than 168 GB of memory, a 32-bit guest may attempt to grant map an MFN that it cannot look up in its mapping of the m2p table. There is an m2p lookup as part of m2p_add_override() and m2p_remove_override(). The lookup falls off the end of the mapped portion of the m2p and (because the mapping is at the highest virtual address) wraps around and the lookup causes a fault on what appears to be a user space address. do_page_fault() (thinking it's a fault to a userspace address), tries to lock mm->mmap_sem. If the gntdev device is used for the grant map, m2p_add_override() is called from gnttab_mmap() with mm->mmap_sem already locked. do_page_fault() then deadlocks. The deadlock would most commonly occur when a 64-bit guest is started and xenconsoled attempts to grant map its console ring. Introduce mfn_to_pfn_no_overrides() which checks the MFN is within the mapped portion of the m2p table before accessing the table and use this in m2p_add_override(), m2p_remove_override(), and mfn_to_pfn() (which already had the correct range check). All faults caused by accessing the non-existent parts of the m2p are thus within the kernel address space and exception_fixup() is called without trying to lock mm->mmap_sem. This means that for MFNs that are outside the mapped range of the m2p then mfn_to_pfn() will always look in the m2p overrides. This is correct because it must be a foreign MFN (and the PFN in the m2p in this case is only relevant for the other domain). Signed-off-by: David Vrabel <david.vrabel@citrix.com> Cc: Stefano Stabellini <stefano.stabellini@citrix.com> Cc: Jan Beulich <JBeulich@suse.com> -- v3: check for auto_translated_physmap in mfn_to_pfn_no_overrides() v2: in mfn_to_pfn() look in m2p_overrides if the MFN is out of range as it's probably foreign.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>	2013-09-25 09:00:03 -04:00
Dave Jones	7a20c2fad6	x86/reboot: Fix apparent cut-n-paste mistake in Dell reboot workaround This seems to have been copied from the Optiplex 990 entry above, but someone forgot to change the ident text. Signed-off-by: Dave Jones <davej@fedoraproject.org> Link: http://lkml.kernel.org/r/20130925001344.GA13554@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-09-25 08:41:10 +02:00
Benjamin Herrenschmidt	dbe78b4011	powerpc/pseries: Do not start secondaries in Open Firmware Starting secondary CPUs early on from Open Firmware and placing them in a holding spin loop slows down the boot process significantly under some hypervisors such as KVM. This is also unnecessary when RTAS supports querying the CPU state. So let's not do it. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-09-25 14:19:00 +10:00
Benjamin Herrenschmidt	0c9fa29149	powerpc/zImage: make the "OF" wrapper support ePAPR boot This makes the "OF" zImage wrapper (zImage.pseries, zImage.pmac, zImage.maple) work if booted via a flat device-tree (ePAPR boot mode), and thus potentially usable with kexec. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-09-25 14:18:44 +10:00
Benjamin Herrenschmidt	cbc9565ee8	powerpc: Remove ksp_limit on ppc64 We've been keeping that field in thread_struct for a while, it contains the "limit" of the current stack pointer and is meant to be used for detecting stack overflows. It has a few problems however: - First, it was never actually used on 64-bit. Set and updated but not actually exploited - When switching stack to/from irq and softirq stacks, its update is racy unless we hard disable interrupts, which is costly. This is fine on 32-bit as we don't soft-disable there but not on 64-bit. Thus rather than fixing 2 in order to implement 1 in some hypothetical future, let's remove the code completely from 64-bit. In order to avoid a clutter of ifdef's, we remove the updates from C code completely during interrupt stack switching, and instead maintain it from the asm helper that is used to do the stack switching in the first place. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-09-25 14:15:51 +10:00
Benjamin Herrenschmidt	0366a1c70b	powerpc/irq: Run softirqs off the top of the irq stack Nowadays, irq_exit() calls __do_softirq() pretty much directly instead of calling do_softirq() which switches to the dedicated softirq stack. This has led to observed stack overflows on powerpc since we call irq_enter() and irq_exit() outside of the scope that switches to the irq stack. This fixes it by moving the stack switching up a level, making irq_enter() and irq_exit() run off the irq stack. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-09-25 14:15:36 +10:00
Paul E. McKenney	22356f447c	mm: Place preemption point in do_mlockall() loop There is a loop in do_mlockall() that lacks a preemption point, which means that the following can happen on non-preemptible builds of the kernel. Dave Jones reports: "My fuzz tester keeps hitting this. Every instance shows the non-irq stack came in from mlockall. I'm only seeing this on one box, but that has more ram (8gb) than my other machines, which might explain it. INFO: rcu_preempt self-detected stall on CPU { 3} (t=6500 jiffies g=470344 c=470343 q=0) sending NMI to all CPUs: NMI backtrace for cpu 3 CPU: 3 PID: 29664 Comm: trinity-child2 Not tainted 3.11.0-rc1+ #32 Call Trace: lru_add_drain_all+0x15/0x20 SyS_mlockall+0xa5/0x1a0 tracesys+0xdd/0xe2" This commit addresses this problem by inserting the required preemption point. Reported-by: Dave Jones <davej@redhat.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com> Cc: Michel Lespinasse <walken@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 19:44:40 -07:00
Dinh Nguyen	53126a2f67	dts: Fix misspelling of Synopsys s/Synopsis/Synopsys s/synopsis/synopsys Signed-off-by: Dinh Nguyen <dinguyen@altera.com> Cc: Pavel Machek <pavel@denx.de> CC: Arnd Bergmann <arnd@arndb.de> CC: Olof Johansson <olof@lixom.net> Cc: Rob Herring <rob.herring@calxeda.com> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Stephen Warren <swarren@wwwdotorg.org> Cc: Ian Campbell <ian.campbell@citrix.com> Cc: Chris Ball <cjb@laptop.org> Cc: Jaehoon Chung <jh80.chung@samsung.com> Cc: Seungwon Jeon <tgih.jun@samsung.com> Cc: Tomasz Figa <tomasz.figa@gmail.com> Cc: devicetree@vger.kernel.org Cc: linux-mmc@vger.kernel.org CC: linux-arm-kernel@lists.infradead.org Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Rob Herring <rob.herring@calxeda.com>	2013-09-24 21:13:38 -05:00
Rob Herring	b0b8c960ff	of: clean-up ifdefs in of_irq.h Much of of_irq.h is needlessly ifdef'ed. Clean this up and minimize the amount of ifdef'ed code. This fixes some build warnings when CONFIG_OF is not enabled (seen on i386 and x86_64): include/linux/of_irq.h:82:7: warning: 'struct device_node' declared inside parameter list [enabled by default] include/linux/of_irq.h:82:7: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default] include/linux/of_irq.h:87:47: warning: 'struct device_node' declared inside parameter list [enabled by default] Compile tested on i386, sparc and arm. Reported-by: Randy Dunlap <rdunlap@infradead.org> Cc: Grant Likely <grant.likely@linaro.org> Signed-off-by: Rob Herring <rob.herring@calxeda.com>	2013-09-24 21:12:32 -05:00
Rob Herring	ede2033c40	openrisc: clean-up prom.h Clean-up some copy/paste declarations that are not necessary. All the functions either don't exist or are already declared in other headers. This is needed in preparation of of_irq.h clean-up. Signed-off-by: Rob Herring <rob.herring@calxeda.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: linux@lists.openrisc.net	2013-09-24 21:12:27 -05:00
Sachin Kamat	116decb7e4	cpufreq: exynos5440: Fix potential NULL pointer dereference If 'dvfs_info' is NULL (due to devm_kzalloc failure) the failure error message would try to dereference it. Use 'pdev' instead. Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-09-25 03:25:58 +02:00
Viresh Kumar	26ca869434	cpufreq: check cpufreq driver is valid and cpufreq isn't disabled in cpufreq_get() cpufreq_get() can be called from external drivers which might not be aware if cpufreq driver is registered or not. And so we should actually check if cpufreq driver is registered or not and also if cpufreq is active or disabled, at the beginning of cpufreq_get(). Otherwise call to lock_policy_rwsem_read() might hit BUG_ON(!policy). Reported-and-tested-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-09-25 03:24:02 +02:00
Yinghai Lu	8a61e12e84	acpi-cpufreq: skip loading acpi_cpufreq after intel_pstate If the hw supports intel_pstate and acpi_cpufreq, intel_pstate will get loaded first. acpi_cpufreq_init() will call acpi_cpufreq_early_init() and that will allocate perf data and init those perf data in ACPI core, (that will cover all CPUs). But later it will free them as cpufreq_register_driver(acpi_cpufreq) will fail as intel_pstate is already registered Use cpufreq_get_current_driver() to check if we can skip the acpi_cpufreq loading. Signed-off-by: Yinghai Lu <yinghai@kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-09-25 03:19:09 +02:00
Lv Zheng	06a8566bcf	ACPI / IPMI: Fix atomic context requirement of ipmi_msg_handler() This patch fixes the issues indicated by the test results that ipmi_msg_handler() is invoked in atomic context. BUG: scheduling while atomic: kipmi0/18933/0x10000100 Modules linked in: ipmi_si acpi_ipmi ... CPU: 3 PID: 18933 Comm: kipmi0 Tainted: G AW 3.10.0-rc7+ #2 Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.0027.070120100606 07/01/2010 ffff8838245eea00 ffff88103fc63c98 ffffffff814c4a1e ffff88103fc63ca8 ffffffff814bfbab ffff88103fc63d28 ffffffff814c73e0 ffff88103933cbd4 0000000000000096 ffff88103fc63ce8 ffff88102f618000 ffff881035c01fd8 Call Trace: <IRQ> [<ffffffff814c4a1e>] dump_stack+0x19/0x1b [<ffffffff814bfbab>] __schedule_bug+0x46/0x54 [<ffffffff814c73e0>] __schedule+0x83/0x59c [<ffffffff81058853>] __cond_resched+0x22/0x2d [<ffffffff814c794b>] _cond_resched+0x14/0x1d [<ffffffff814c6d82>] mutex_lock+0x11/0x32 [<ffffffff8101e1e9>] ? __default_send_IPI_dest_field.constprop.0+0x53/0x58 [<ffffffffa09e3f9c>] ipmi_msg_handler+0x23/0x166 [ipmi_si] [<ffffffff812bf6e4>] deliver_response+0x55/0x5a [<ffffffff812c0fd4>] handle_new_recv_msgs+0xb67/0xc65 [<ffffffff81007ad1>] ? read_tsc+0x9/0x19 [<ffffffff814c8620>] ? _raw_spin_lock_irq+0xa/0xc [<ffffffffa09e1128>] ipmi_thread+0x5c/0x146 [ipmi_si] ... Also Tony Camuso says: We were getting occasional "Scheduling while atomic" call traces during boot on some systems. Problem was first seen on a Cisco C210 but we were able to reproduce it on a Cisco c220m3. Setting CONFIG_LOCKDEP and LOCKDEP_SUPPORT to 'y' exposed a lockdep around tx_msg_lock in acpi_ipmi.c struct acpi_ipmi_device. ================================= [ INFO: inconsistent lock state ] 2.6.32-415.el6.x86_64-debug-splck #1 --------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. 
ksoftirqd/3/17 [HC0[0]:SC1[1]:HE1:SE0] takes: (&ipmi_device->tx_msg_lock){+.?...}, at: [<ffffffff81337a27>] ipmi_msg_handler+0x71/0x126 {SOFTIRQ-ON-W} state was registered at: [<ffffffff810ba11c>] __lock_acquire+0x63c/0x1570 [<ffffffff810bb0f4>] lock_acquire+0xa4/0x120 [<ffffffff815581cc>] __mutex_lock_common+0x4c/0x400 [<ffffffff815586ea>] mutex_lock_nested+0x4a/0x60 [<ffffffff8133789d>] acpi_ipmi_space_handler+0x11b/0x234 [<ffffffff81321c62>] acpi_ev_address_space_dispatch+0x170/0x1be The fix implemented by this change has been tested by Tony: Tested the patch in a boot loop with lockdep debug enabled and never saw the problem in over 400 reboots. Reported-and-tested-by: Tony Camuso <tcamuso@redhat.com> Signed-off-by: Lv Zheng <lv.zheng@intel.com> Reviewed-by: Huang Ying <ying.huang@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2013-09-25 03:12:05 +02:00
Linus Torvalds	a153e67bda	Merge branch 'akpm' (patches from Andrew Morton) Merge fixes from Andrew Morton: "Bunch of fixes. And a reversion of mhocko's "Soft limit rework" patch series. This is actually your fault for opening the merge window when I was off racing ;) I didn't read the email thread before sending everything off. Johannes Weiner raised significant issues: http://www.spinics.net/lists/cgroups/msg08813.html and we agreed to back it all out" I clearly need to be more aware of Andrew's racing schedule. * akpm: MAINTAINERS: update mach-bcm related email address checkpatch: make extern in .h prototypes quieter cciss: fix info leak in cciss_ioctl32_passthru() cpqarray: fix info leak in ida_locked_ioctl() kernel/reboot.c: re-enable the function of variable reboot_default audit: fix endless wait in audit_log_start() revert "memcg, vmscan: integrate soft reclaim tighter with zone shrinking code" revert "memcg: get rid of soft-limit tree infrastructure" revert "vmscan, memcg: do softlimit reclaim also for targeted reclaim" revert "memcg: enhance memcg iterator to support predicates" revert "memcg: track children in soft limit excess to improve soft limit" revert "memcg, vmscan: do not attempt soft limit reclaim if it would not scan anything" revert "memcg: track all children over limit in the root" revert "memcg, vmscan: do not fall into reclaim-all pass too quickly" fs/ocfs2/super.c: use a bigger nodestr in ocfs2_dismount_volume watchdog: update watchdog_thresh properly watchdog: update watchdog attributes atomically	2013-09-24 17:00:35 -07:00
Christian Daudt	497a045d13	MAINTAINERS: update mach-bcm related email address Update email address on mach-bcm + drivers for Broadcom mobile SoCs. Signed-off-by: Christian Daudt <csd@broadcom.com> Cc: Olof Johansson <olof@lixom.net> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Stephen Warren <swarren@wwwdotorg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:26 -07:00
Joe Perches	d1d85780dd	checkpatch: make extern in .h prototypes quieter The use of extern in .h files is a bit contentious. Make the warning be emitted only when --strict is used on the command line. Signed-off-by: Joe Perches <joe@perches.com> Cc: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:26 -07:00
Dan Carpenter	58f09e00ae	cciss: fix info leak in cciss_ioctl32_passthru() The arg64 struct has a hole after ->buf_size which isn't cleared. Or if any of the calls to copy_from_user() fail then that would cause an information leak as well. This was assigned CVE-2013-2147. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Mike Miller <mike.miller@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:26 -07:00
Dan Carpenter	627aad1c01	cpqarray: fix info leak in ida_locked_ioctl() The pciinfo struct has a two byte hole after ->dev_fn so stack information could be leaked to the user. This was assigned CVE-2013-2147. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Mike Miller <mike.miller@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:26 -07:00
Chuansheng Liu	e2f0b88e84	kernel/reboot.c: re-enable the function of variable reboot_default Commit `1b3a5d02ee` ("reboot: move arch/x86 reboot= handling to generic kernel") did some cleanup for reboot= command line, but it made the reboot_default inoperative. The default value of variable reboot_default should be 1, and if command line reboot= is not set, system will use the default reboot mode. [akpm@linux-foundation.org: fix comment layout] Signed-off-by: Li Fei <fei.li@intel.com> Signed-off-by: liu chuansheng <chuansheng.liu@intel.com> Acked-by: Robin Holt <robinmholt@linux.com> Cc: <stable@vger.kernel.org> [3.11.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:26 -07:00
Konstantin Khlebnikov	8ac1c8d5de	audit: fix endless wait in audit_log_start() After commit `829199197a` ("kernel/audit.c: avoid negative sleep durations") audit emitters will block forever if userspace daemon cannot handle backlog. After the timeout the waiting loop turns into busy loop and runs until daemon dies or returns back to work. This is a minimal patch for that bug. Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org> Cc: Luiz Capitulino <lcapitulino@redhat.com> Cc: Richard Guy Briggs <rgb@redhat.com> Cc: Eric Paris <eparis@redhat.com> Cc: Chuck Anderson <chuck.anderson@oracle.com> Cc: Dan Duval <dan.duval@oracle.com> Cc: Dave Kleikamp <dave.kleikamp@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:26 -07:00
Andrew Morton	0608f43da6	revert "memcg, vmscan: integrate soft reclaim tighter with zone shrinking code" Revert commit `3b38722efd` ("memcg, vmscan: integrate soft reclaim tighter with zone shrinking code") I merged this prematurely - Michal and Johannes still disagree about the overall design direction and the future remains unclear. Cc: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:26 -07:00
Andrew Morton	bb4cc1a8b5	revert "memcg: get rid of soft-limit tree infrastructure" Revert commit `e883110aad` ("memcg: get rid of soft-limit tree infrastructure") I merged this prematurely - Michal and Johannes still disagree about the overall design direction and the future remains unclear. Cc: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:26 -07:00
Andrew Morton	b1aff7fcf8	revert "vmscan, memcg: do softlimit reclaim also for targeted reclaim" Revert commit `a5b7c87f92` ("vmscan, memcg: do softlimit reclaim also for targeted reclaim") I merged this prematurely - Michal and Johannes still disagree about the overall design direction and the future remains unclear. Cc: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:26 -07:00
Andrew Morton	694fbc0fe7	revert "memcg: enhance memcg iterator to support predicates" Revert commit `de57780dc6` ("memcg: enhance memcg iterator to support predicates") I merged this prematurely - Michal and Johannes still disagree about the overall design direction and the future remains unclear. Cc: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:26 -07:00
Andrew Morton	30361e51ca	revert "memcg: track children in soft limit excess to improve soft limit" Revert commit `7d910c054b` ("memcg: track children in soft limit excess to improve soft limit") I merged this prematurely - Michal and Johannes still disagree about the overall design direction and the future remains unclear. Cc: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:25 -07:00
Andrew Morton	3120055e86	revert "memcg, vmscan: do not attempt soft limit reclaim if it would not scan anything" Revert commit `e839b6a1c8` ("memcg, vmscan: do not attempt soft limit reclaim if it would not scan anything") I merged this prematurely - Michal and Johannes still disagree about the overall design direction and the future remains unclear. Cc: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:25 -07:00
Andrew Morton	8f939a9f4c	revert "memcg: track all children over limit in the root" Revert commit `1be171d60b` ("memcg: track all children over limit in the root") I merged this prematurely - Michal and Johannes still disagree about the overall design direction and the future remains unclear. Cc: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:25 -07:00
Andrew Morton	20ba27f52e	revert "memcg, vmscan: do not fall into reclaim-all pass too quickly" Revert commit `e975de998b` ("memcg, vmscan: do not fall into reclaim-all pass too quickly") I merged this prematurely - Michal and Johannes still disagree about the overall design direction and the future remains unclear. Cc: Michal Hocko <mhocko@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:25 -07:00
Goldwyn Rodrigues	99d7a8824a	fs/ocfs2/super.c: use a bigger nodestr in ocfs2_dismount_volume While printing 32-bit node numbers, an 8-byte string is not enough. Increase the size of the string to 12 chars. This got left out in commit `49fa8140e4` ("fs/ocfs2/super.c: Use bigger nodestr to accomodate 32-bit node numbers"). Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mfasheh@suse.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:25 -07:00
Michal Hocko	9809b18fcf	watchdog: update watchdog_thresh properly watchdog_thresh controls how often nmi perf event counter checks per-cpu hrtimer_interrupts counter and blows up if the counter hasn't changed since the last check. The counter is updated by per-cpu watchdog_hrtimer hrtimer which is scheduled with 2/5 watchdog_thresh period which guarantees that hrtimer is scheduled 2 times per the main period. Both hrtimer and perf event are started together when the watchdog is enabled. So far so good. But... But what happens when watchdog_thresh is updated from sysctl handler? proc_dowatchdog will set a new sampling period and hrtimer callback (watchdog_timer_fn) will use the new value in the next round. The problem, however, is that nobody tells the perf event that the sampling period has changed so it is ticking with the period configured when it has been set up. This might result in an ear ripping dissonance between perf and hrtimer parts if the watchdog_thresh is increased. And even worse it might lead to KABOOM if the watchdog is configured to panic on such a spurious lockup. This patch fixes the issue by updating both nmi perf event counter and hrtimers if the threshold value has changed. The nmi one is disabled and then reinitialized from scratch. This has an unpleasant side effect that the allocation of the new event might fail theoretically so the hard lockup detector would be disabled for such cpus. On the other hand such a memory allocation failure is very unlikely because the original event is deallocated right before. It would be much nicer if we just changed perf event period but there doesn't seem to be any API to do that right now. It is also unfortunate that perf_event_alloc uses GFP_KERNEL allocation unconditionally so we cannot use on_each_cpu() and do the same thing from the per-cpu context.
The update from the current CPU should be safe because perf_event_disable removes the event atomically before it clears the per-cpu watchdog_ev so it cannot change anything under running handler feet. The hrtimer is simply restarted (thanks to Don Zickus who has pointed this out) if it is queued because we cannot rely on it to fire and adapt to the new sampling period before a new nmi event triggers (when the threshold is decreased). [akpm@linux-foundation.org: the UP version of __smp_call_function_single ended up in the wrong place] Signed-off-by: Michal Hocko <mhocko@suse.cz> Acked-by: Don Zickus <dzickus@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Cc: Fabio Estevam <festevam@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:25 -07:00
Michal Hocko	359e6fab66	watchdog: update watchdog attributes atomically proc_dowatchdog doesn't synchronize multiple callers which might lead to confusion when two parallel callers might confuse watchdog_enable_all_cpus resp watchdog_disable_all_cpus (eg watchdog gets enabled even if watchdog_thresh was set to 0 already). This patch adds a local mutex which synchronizes callers to the sysctl handler. Signed-off-by: Michal Hocko <mhocko@suse.cz> Cc: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Don Zickus <dzickus@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 17:00:25 -07:00
Linus Torvalds	e288e931c1	Merge branch 'bcache' (bcache fixes from Kent Overstreet) Merge bcache fixes from Kent Overstreet: "There's fixes for _three_ different data corruption bugs, all of which were found by users hitting them in the wild. The first one isn't bcache specific - in 3.11 bcache was switched to the bio_copy_data in fs/bio.c, and that's when the bug in that code was discovered, but it's also used by raid1 and pktcdvd. (That was my code too, so the bug's doubly embarrassing given that it was or should've been just a cut and paste from bcache code. Dunno what happened there). Most of these (all the non data corruption bugs, actually) were ready before the merge window and have been sitting in Jens' tree, but I don't know what's been up with him lately..." * emailed patches from Kent Overstreet <kmo@daterainc.com>: bcache: Fix flushes in writeback mode bcache: Fix for handling overlapping extents when reading in a btree node bcache: Fix a shrinker deadlock bcache: Fix a dumb CPU spinning bug in writeback bcache: Fix a flush/fua performance bug bcache: Fix a writeback performance regression bcache: Correct printf()-style format length modifier bcache: Fix for when no journal entries are found bcache: Strip endline when writing the label through sysfs bcache: Fix a dumb journal discard bug block: Fix bio_copy_data()	2013-09-24 14:42:03 -07:00
Kent Overstreet	c0f04d88e4	bcache: Fix flushes in writeback mode In writeback mode, when we get a cache flush we need to make sure we issue a flush to the backing device. The code for sending down an extra flush was wrong - by cloning the bio we were probably getting flags that didn't make sense for a bare flush, and also the old code was firing for FUA bios, for which we don't need to send a flush to the backing device. This was causing data corruption somehow - the mechanism was never determined, but this patch fixes it for the users that were seeing it. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 14:41:43 -07:00
Kent Overstreet	84786438ed	bcache: Fix for handling overlapping extents when reading in a btree node btree_sort_fixup() was overly clever, because it was trying to avoid pulling a key off the btree iterator in more than one place. This led to a really obscure bug where we'd break early from the loop in btree_sort_fixup() if the current key overlapped with keys in more than one older set, and the next key it overlapped with was zero size. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 14:41:43 -07:00
Kent Overstreet	a698e08c82	bcache: Fix a shrinker deadlock GFP_NOIO means we could be getting called recursively - mca_alloc() -> mca_data_alloc() - definitely can't use mutex_lock(bucket_lock) then. Whoops. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 14:41:43 -07:00
Kent Overstreet	79e3dab90d	bcache: Fix a dumb CPU spinning bug in writeback schedule_timeout() != schedule_timeout_uninterruptible() Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 14:41:43 -07:00
Kent Overstreet	1394d6761b	bcache: Fix a flush/fua performance bug bch_journal_meta() was missing the flush to make the journal write actually go down (instead of waiting up to journal_delay_ms)... Whoops Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 14:41:43 -07:00
Kent Overstreet	c2a4f3183a	bcache: Fix a writeback performance regression Background writeback works by scanning the btree for dirty data and adding those keys into a fixed size buffer, then for each dirty key in the keybuf writing it to the backing device. When read_dirty() finishes and it's time to scan for more dirty data, we need to wait for the outstanding writeback IO to finish - they still take up slots in the keybuf (so that foreground writes can check for them to avoid races) - without that wait, we'll continually rescan when we'll be able to add at most a key or two to the keybuf, and that takes locks that starves foreground IO. Doh. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 14:41:43 -07:00
Geert Uytterhoeven	61cbd250f8	bcache: Correct printf()-style format length modifier Fix drivers/md/bcache/btree.c: In function ‘bch_btree_node_read’: drivers/md/bcache/btree.c:259: warning: format ‘%lu’ expects type ‘long unsigned int’, but argument 3 has type ‘size_t’ Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Kent Overstreet <kmo@daterainc.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 14:41:43 -07:00
Kent Overstreet	c426c4fd46	bcache: Fix for when no journal entries are found The journal replay code didn't handle this case, causing it to go into an infinite loop... Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 14:41:43 -07:00
Gabriel de Perthuis	aee6f1cfff	bcache: Strip endline when writing the label through sysfs sysfs attributes with unusual characters have crappy failure modes in Squeeze (udev 164); later versions of udev are unaffected. This should make these characters more unusual. Signed-off-by: Gabriel de Perthuis <g2p.code@gmail.com> Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 14:41:43 -07:00
Kent Overstreet	6d9d21e35f	bcache: Fix a dumb journal discard bug That switch statement was obviously wrong, leading to some sort of weird spinning on rare occasion with discards enabled... Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: linux-stable <stable@vger.kernel.org> # >= v3.10 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 14:41:43 -07:00
Kent Overstreet	2f6cf0de02	block: Fix bio_copy_data() The memcpy() in bio_copy_data() was using the wrong offset vars, leading to data corruption in weird unusual setups. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: linux-stable <stable@vger.kernel.org> # >= v3.9 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2013-09-24 14:41:42 -07:00
David Vrabel	24f69373e2	xen/balloon: don't alloc page while non-preemptible get_balloon_scratch_page() disables preemption so we cannot call alloc_page() in between get/put_balloon_scratch_page(). Shuffle bits around in decrease_reservation() to avoid this. Signed-off-by: David Vrabel <david.vrabel@citrix.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-09-24 16:22:27 -04:00
Konrad Rzeszutek Wilk	a945928ea2	xen: Do not enable spinlocks before jump_label_init() has executed xen_init_spinlocks() currently calls static_key_slow_inc() before jump_label_init() is invoked. When CONFIG_JUMP_LABEL is set (which usually is the case) the effect of this static_key_slow_inc() is deferred until after jump_label_init(). This is different from when CONFIG_JUMP_LABEL is not set, in which case the key is set immediately. Thus, depending on the value of config option, we may observe different behavior. In addition, when we come to __jump_label_transform() from jump_label_init(), the key (paravirt_ticketlocks_enabled) is already enabled. On processors where ideal_nop is not the same as default_nop this will cause a BUG() since it is expected that before a key is enabled the latter is replaced by the former during initialization. To address this problem we need to move static_key_slow_inc(&paravirt_ticketlocks_enabled) so that it is called after jump_label_init(). We also need to make sure that this is done before other cpus start to boot. early_initcall appears to be a good place to do so. (Note that we cannot move whole xen_init_spinlocks() there since pv_lock_ops need to be set before alternative_instructions() runs.) Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> [v2: Added extra comments in the code] Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reviewed-by: Steven Rostedt <rostedt@goodmis.org>	2013-09-24 16:22:26 -04:00
Jason Gunthorpe	bf4a7c054b	tpm: xen-tpmfront: Remove the locality sysfs attribute Upon deeper review it was agreed to remove the driver-unique 'locality' sysfs attribute before it is present in a released kernel. The attribute was introduced in `e2683957fb` during the 3.12 merge window, so this patch needs to go in before 3.12 is released. The hope is to have a well defined locality API that all the other locality aware drivers can use, perhaps in 3.13. Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>	2013-09-24 16:15:15 -04:00
Jason Gunthorpe	56be88954b	tpm: xen-tpmfront: Fix default durations All the default durations were being set to 10 minutes which is way too long for the timeouts. Normal values for the longest duration are around 5 mins, and short durations are around .5s. Further, these are just the defaults; tpm_get_timeouts will set them to values from the TPM (or throw an error). Just remove them. Acked-by: Daniel De Graaf <dgdegra@tycho.nsa.gov> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Signed-off-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>	2013-09-24 16:14:55 -04:00

400161 Commits