linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-23 13:32:50 +07:00

Author	SHA1	Message	Date
Claudio Scordino	e97a90f706	sched/cpufreq: Rate limits for SCHED_DEADLINE When the SCHED_DEADLINE scheduling class increases the CPU utilization, it should not wait for the rate limit, otherwise it may miss some deadline. Tests using rt-app on Exynos5422 with up to 10 SCHED_DEADLINE tasks have shown reductions of even 10% of deadline misses with a negligible increase of energy consumption (measured through Baylibre Cape). Signed-off-by: Claudio Scordino <claudio@evidence.eu.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: linux-pm@vger.kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Patrick Bellasi <patrick.bellasi@arm.com> Cc: Todd Kjos <tkjos@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Link: https://lkml.kernel.org/r/1520937340-2755-1-git-send-email-claudio@evidence.eu.com	2018-03-23 22:48:22 +01:00
Patrick Bellasi	d519329f72	sched/fair: Update util_est only on util_avg updates The estimated utilization of a task is currently updated every time the task is dequeued. However, to keep overheads under control, PELT signals are effectively updated at maximum once every 1ms. Thus, for really short running tasks, it can happen that their util_avg value has not been updates since their last enqueue. If such tasks are also frequently running tasks (e.g. the kind of workload generated by hackbench) it can also happen that their util_avg is updated only every few activations. This means that updating util_est at every dequeue potentially introduces not necessary overheads and it's also conceptually wrong if the util_avg signal has never been updated during a task activation. Let's introduce a throttling mechanism on task's util_est updates to sync them with util_avg updates. To make the solution memory efficient, both in terms of space and load/store operations, we encode a synchronization flag into the LSB of util_est.enqueued. This makes util_est an even values only metric, which is still considered good enough for its purpose. The synchronization bit is (re)set by __update_load_avg_se() once the PELT signal of a task has been updated during its last activation. Such a throttling mechanism allows to keep under control util_est overheads in the wakeup hot path, thus making it a suitable mechanism which can be enabled also on high-intensity workload systems. Thus, this now switches on by default the estimation utilization scheduler feature. Suggested-by: Chris Redpath <chris.redpath@arm.com> Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@android.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: http://lkml.kernel.org/r/20180309095245.11071-5-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-03-20 08:11:09 +01:00
Patrick Bellasi	a07630b8b2	sched/cpufreq/schedutil: Use util_est for OPP selection When schedutil looks at the CPU utilization, the current PELT value for that CPU is returned straight away. In certain scenarios this can have undesired side effects and delays on frequency selection. For example, since the task utilization is decayed at wakeup time, a long sleeping big task newly enqueued does not add immediately a significant contribution to the target CPU. This introduces some latency before schedutil will be able to detect the best frequency required by that task. Moreover, the PELT signal build-up time is a function of the current frequency, because of the scale invariant load tracking support. Thus, starting from a lower frequency, the utilization build-up time will increase even more and further delays the selection of the actual frequency which better serves the task requirements. In order to reduce these kind of latencies, we integrate the usage of the CPU's estimated utilization in the sugov_get_util function. This allows to properly consider the expected utilization of a CPU which, for example, has just got a big task running after a long sleep period. Ultimately this allows to select the best frequency to run a task right after its wake-up. Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steve Muckle <smuckle@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@android.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Link: http://lkml.kernel.org/r/20180309095245.11071-4-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-03-20 08:11:08 +01:00
Patrick Bellasi	f9be3e5961	sched/fair: Use util_est in LB and WU paths When the scheduler looks at the CPU utilization, the current PELT value for a CPU is returned straight away. In certain scenarios this can have undesired side effects on task placement. For example, since the task utilization is decayed at wakeup time, when a long sleeping big task is enqueued it does not add immediately a significant contribution to the target CPU. As a result we generate a race condition where other tasks can be placed on the same CPU while it is still considered relatively empty. In order to reduce this kind of race conditions, this patch introduces the required support to integrate the usage of the CPU's estimated utilization in the wakeup path, via cpu_util_wake(), as well as in the load-balance path, via cpu_util() which is used by update_sg_lb_stats(). The estimated utilization of a CPU is defined to be the maximum between its PELT's utilization and the sum of the estimated utilization (at previous dequeue time) of all the tasks currently RUNNABLE on that CPU. This allows to properly represent the spare capacity of a CPU which, for example, has just got a big task running since a long sleep period. Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@android.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: http://lkml.kernel.org/r/20180309095245.11071-3-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-03-20 08:11:07 +01:00
Patrick Bellasi	7f65ea42eb	sched/fair: Add util_est on top of PELT The util_avg signal computed by PELT is too variable for some use-cases. For example, a big task waking up after a long sleep period will have its utilization almost completely decayed. This introduces some latency before schedutil will be able to pick the best frequency to run a task. The same issue can affect task placement. Indeed, since the task utilization is already decayed at wakeup, when the task is enqueued in a CPU, this can result in a CPU running a big task as being temporarily represented as being almost empty. This leads to a race condition where other tasks can be potentially allocated on a CPU which just started to run a big task which slept for a relatively long period. Moreover, the PELT utilization of a task can be updated every [ms], thus making it a continuously changing value for certain longer running tasks. This means that the instantaneous PELT utilization of a RUNNING task is not really meaningful to properly support scheduler decisions. For all these reasons, a more stable signal can do a better job of representing the expected/estimated utilization of a task/cfs_rq. Such a signal can be easily created on top of PELT by still using it as an estimator which produces values to be aggregated on meaningful events. This patch adds a simple implementation of util_est, a new signal built on top of PELT's util_avg where: util_est(task) = max(task::util_avg, f(task::util_avg@dequeue)) This allows to remember how big a task has been reported by PELT in its previous activations via f(task::util_avg@dequeue), which is the new _task_util_est(struct task_struct*) function added by this patch. If a task should change its behavior and it runs longer in a new activation, after a certain time its util_est will just track the original PELT signal (i.e. task::util_avg). The estimated utilization of cfs_rq is defined only for root ones. That's because the only sensible consumer of this signal are the scheduler and schedutil when looking for the overall CPU utilization due to FAIR tasks. For this reason, the estimated utilization of a root cfs_rq is simply defined as: util_est(cfs_rq) = max(cfs_rq::util_avg, cfs_rq::util_est::enqueued) where: cfs_rq::util_est::enqueued = sum(_task_util_est(task)) for each RUNNABLE task on that root cfs_rq It's worth noting that the estimated utilization is tracked only for objects of interests, specifically: - Tasks: to better support tasks placement decisions - root cfs_rqs: to better support both tasks placement decisions as well as frequencies selection Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@android.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: http://lkml.kernel.org/r/20180309095245.11071-2-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-03-20 08:11:06 +01:00
Ingo Molnar	10c18c44a6	Merge branch 'linus' into sched/core, to pick up fixes Signed-off-by: Ingo Molnar <mingo@kernel.org>	2018-03-20 08:08:02 +01:00
Linus Torvalds	1b5f3ba415	Merge branch 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: "Two commits to fix the following subtle cgroup2 behavior bugs: - cpu.max was rejecting config when it shouldn't - thread mode enable was allowed when it shouldn't" * 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup: fix rule checking for threaded mode switching sched, cgroup: Don't reject lower cpu.max on ancestors	2018-03-19 15:39:02 -07:00
Linus Torvalds	c6256ca9c0	Merge branch 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq Pull workqueue fixes from Tejun Heo: "Two low-impact workqueue commits. One fixes workqueue creation error path and the other removes the unused cancel_work()" * 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: remove unused cancel_work() workqueue: use put_device() instead of kfree()	2018-03-19 15:13:04 -07:00
Linus Torvalds	0d707a2f24	Merge branch 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu Pull percpu fixes from Tejun Heo: "Late percpu pull request for v4.16-rc6. - percpu allocator pool replenishing no longer triggers OOM or warning messages. Also, the alloc interface now understands __GFP_NORETRY and __GFP_NOWARN. This is to allow avoiding OOMs from userland triggered actions like bpf map creation. Also added cond_resched() in alloc loop. - perpcu allocation now can be interrupted by kill sigs to avoid deadlocking OOM killer. - Added Dennis Zhou as a co-maintainer. He has rewritten the area map allocator, understands most of the code base and has been responsive for all bug reports" * 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: percpu_ref: Update doc to dissuade users from depending on internal RCU grace periods mm: Allow to kill tasks doing pcpu_alloc() and waiting for pcpu_balance_workfn() percpu: include linux/sched.h for cond_resched() percpu: add a schedule point in pcpu_balance_workfn() percpu: allow select gfp to be passed to underlying allocators percpu: add __GFP_NORETRY semantics to the percpu balancing path percpu: match chunk allocator declarations with definitions percpu: add Dennis Zhou as a percpu co-maintainer	2018-03-19 14:48:35 -07:00
Linus Torvalds	efac2483e8	Merge branch 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata Pull libata fixes from Tejun Heo: "I sat on them too long and it's quite a few this late, but nothing has a wide blast area. The changes are... - Fix corner cases in SG command handling. - Recent introduction of default powersaving mode config option exposed several devices with broken powersaving behaviors. A number of patches to update the blacklist accordingly. - Fix a kernel panic on SAS hotplug. - Other misc and device specific updates" * 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata: libata: Modify quirks for MX100 to limit NCQ_TRIM quirk to MU01 version libata: Make Crucial BX100 500GB LPM quirk apply to all firmware versions libata: Apply NOLPM quirk to Crucial M500 480 and 960GB SSDs libata: Enable queued TRIM for Samsung SSD 860 PCI: Add function 1 DMA alias quirk for Highpoint RocketRAID 644L ahci: Add PCI-id for the Highpoint Rocketraid 644L card ata: do not schedule hot plug if it is a sas host libata: disable LPM for Crucial BX100 SSD 500GB drive libata: Apply NOLPM quirk to Crucial MX100 512GB SSDs libata: update documentation for sysfs interfaces ata: sata_rcar: Remove unused variable in sata_rcar_init_controller() libata: transport: cleanup documentation of sysfs interface sata_rcar: Reset SATA PHY when Salvator-X board resumes libata: don't try to pass through NCQ commands to non-NCQ devices libata: remove WARN() for DMA or PIO command without data libata: fix length validation of ATAPI-relayed SCSI commands ata: libahci: fix comment indentation ahci: Add check for device presence (PCIe hot unplug) in ahci_stop_engine() libata: Fix compile warning with ATA_DEBUG enabled	2018-03-19 14:23:30 -07:00
Tejun Heo	b3a5d11199	percpu_ref: Update doc to dissuade users from depending on internal RCU grace periods percpu_ref internally uses sched-RCU to implement the percpu -> atomic mode switching and the documentation suggested that this could be depended upon. This doesn't seem like a good idea. * percpu_ref uses sched-RCU which has different grace periods regular RCU. Users may combine percpu_ref with regular RCU usage and incorrectly believe that regular RCU grace periods are performed by percpu_ref. This can lead to, for example, use-after-free due to premature freeing. * percpu_ref has a grace period when switching from percpu to atomic mode. It doesn't have one between the last put and release. This distinction is subtle and can lead to surprising bugs. * percpu_ref allows starting in and switching to atomic mode manually for debugging and other purposes. This means that there may not be any grace periods from kill to release. This patch makes it clear that the grace periods are percpu_ref's internal implementation detail and can't be depended upon by the users. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Tejun Heo <tj@kernel.org>	2018-03-19 10:09:44 -07:00
Kirill Tkhai	f52ba1fef7	mm: Allow to kill tasks doing pcpu_alloc() and waiting for pcpu_balance_workfn() In case of memory deficit and low percpu memory pages, pcpu_balance_workfn() takes pcpu_alloc_mutex for a long time (as it makes memory allocations itself and waits for memory reclaim). If tasks doing pcpu_alloc() are choosen by OOM killer, they can't exit, because they are waiting for the mutex. The patch makes pcpu_alloc() to care about killing signal and use mutex_lock_killable(), when it's allowed by GFP flags. This guarantees, a task does not miss SIGKILL from OOM killer. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2018-03-19 09:38:50 -07:00
Tejun Heo	71546d1004	percpu: include linux/sched.h for cond_resched() microblaze build broke due to missing declaration of the cond_resched() invocation added recently. Let's include linux/sched.h explicitly. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: kbuild test robot <fengguang.wu@intel.com>	2018-03-19 09:38:18 -07:00
Hans de Goede	d418ff56b8	libata: Modify quirks for MX100 to limit NCQ_TRIM quirk to MU01 version When commit `9c7be59fc5` ("libata: Apply NOLPM quirk to Crucial MX100 512GB SSDs") was added it inherited the ATA_HORKAGE_NO_NCQ_TRIM quirk from the existing "Crucial_CTMX100" entry, but that entry sets model_rev to "MU01", where as the entry adding the NOLPM quirk sets it to NULL. This means that after this commit we no apply the NO_NCQ_TRIM quirk to all "Crucial_CT512MX100" SSDs even if they have the fixed "MU02" firmware. This commit splits the "Crucial_CT512MX100" quirk into 2 quirks, one for the "MU01" firmware and one for all other firmware versions, so that we once again only apply the NO_NCQ_TRIM quirk to the "MU01" firmware version. Fixes: `9c7be59fc5` ("libata: Apply NOLPM quirk to ... MX100 512GB SSDs") Cc: stable@vger.kernel.org Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2018-03-19 08:36:38 -07:00
Hans de Goede	3bf7b5d6d0	libata: Make Crucial BX100 500GB LPM quirk apply to all firmware versions Commit `b17e5729a6` ("libata: disable LPM for Crucial BX100 SSD 500GB drive"), introduced a ATA_HORKAGE_NOLPM quirk for Crucial BX100 500GB SSDs but limited this to the MU02 firmware version, according to: http://www.crucial.com/usa/en/support-ssd-firmware MU02 is the last version, so there are no newer possibly fixed versions and if the MU02 version has broken LPM then the MU01 almost certainly also has broken LPM, so this commit changes the quirk to apply to all firmware versions. Fixes: `b17e5729a6` ("libata: disable LPM for Crucial BX100 SSD 500GB...") Cc: stable@vger.kernel.org Cc: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2018-03-19 08:36:38 -07:00
Hans de Goede	62ac3f7305	libata: Apply NOLPM quirk to Crucial M500 480 and 960GB SSDs There have been reports of the Crucial M500 480GB model not working with LPM set to min_power / med_power_with_dipm level. It has not been tested with medium_power, but that typically has no measurable power-savings. Note the reporters Crucial_CT480M500SSD3 has a firmware version of MU03 and there is a MU05 update available, but that update does not mention any LPM fixes in its changelog, so the quirk matches all firmware versions. In my experience the LPM problems with (older) Crucial SSDs seem to be limited to higher capacity versions of the SSDs (different firmware?), so this commit adds a NOLPM quirk for the 480 and 960GB versions of the M500, to avoid LPM causing issues with these SSDs. Cc: stable@vger.kernel.org Reported-and-tested-by: Martin Steigerwald <martin@lichtvoll.de> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>	2018-03-19 08:36:38 -07:00
Linus Torvalds	c698ca5278	Linux 4.16-rc6	2018-03-18 17:48:42 -07:00
Linus Torvalds	9e1909b9da	Merge branch 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/pti updates from Thomas Gleixner: "Another set of melted spectrum updates: - Iron out the last late microcode loading issues by actually checking whether new microcode is present and preventing the CPU synchronization to run into a timeout induced hang. - Remove Skylake C2 from the microcode blacklist according to the latest Intel documentation - Fix the VM86 POPF emulation which traps if VIP is set, but VIF is not. Enhance the selftests to catch that kind of issue - Annotate indirect calls/jumps for objtool on 32bit. This is not a functional issue, but for consistency sake its the right thing to do. - Fix a jump label build warning observed on SPARC64 which uses 32bit storage for the code location which is casted to 64 bit pointer w/o extending it to 64bit first. - Add two new cpufeature bits. Not really an urgent issue, but provides them for both x86 and x86/kvm work. No impact on the current kernel" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/microcode: Fix CPU synchronization routine x86/microcode: Attempt late loading only when new microcode is present x86/speculation: Remove Skylake C2 from Speculation Control microcode blacklist jump_label: Fix sparc64 warning x86/speculation, objtool: Annotate indirect calls/jumps for objtool on 32-bit kernels x86/vm86/32: Fix POPF emulation selftests/x86/entry_from_vm86: Add test cases for POPF selftests/x86/entry_from_vm86: Exit with 1 if we fail x86/cpufeatures: Add Intel PCONFIG cpufeature x86/cpufeatures: Add Intel Total Memory Encryption cpufeature	2018-03-18 12:03:15 -07:00
Linus Torvalds	df4fe17802	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fix from Thomas Gleixner: "A single fix for vmalloc_fault() which uses pd_huge() unconditionally whether CONFIG_HUGETLBFS is set or not. In case of CONFIG_HUGETLBFS=n this results in a crash as pd_huge() returns 0 in that case" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Fix vmalloc_fault to use pXd_large	2018-03-18 12:01:14 -07:00
Linus Torvalds	d2149e13e5	Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Thomas Gleixner: "Three fixes for irq chip drivers: - Make sure the allocations in the GIC-V3 ITS driver are large enough to accomodate the interrupt space - Fix a misplaced __iomem annotation which causes a splat of 26 sparse warnings - Remove an unused function in the IMX GPCV2 driver which causes build warnings" * 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip/irq-imx-gpcv2: Remove unused function irqchip/gic-v3-its: Ensure nr_ites >= nr_lpis irqchip/gic-v3-its: Fix misplaced __iomem annotations	2018-03-18 11:59:14 -07:00
Linus Torvalds	23fe85ae4f	Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI fix from Thomas Gleixner: "A single fix to prevent partially initialized pointers in mixed mode (64bit kernel on 32bit UEFI)" * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi/libstub/tpm: Initialize pointer variables to zero for mixed mode	2018-03-18 11:56:53 -07:00
Linus Torvalds	3cd1d3273f	* PPC: Fix bug leading to lost IPIs and smp_call_function_many() lockups on POWER9. * ARM: locking fix, reset fix, GICv2 multi-source SGI injection fix, GICv2-on-v3 MMIO synchronization fix, make the console less verbose. * x86: fix device passthrough on AMD SME. -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQEcBAABAgAGBQJariCQAAoJEL/70l94x66D6j4H/iS51f1EFRjTXbM+/D0BxpOD hTz3tiV4JNezwXkxm9FMr7e5sfONTsqZodzCeXbz9XRmb6lwkEIbaPC4G2ALJeBH V1Gpy2NsoBNCaU6Ci7SGqs1PG7wmG1tGMkzgkUoi2mwJRbA7UfnPmL1bTRESc4/w oYN5xPj1/bLYWnok2cEJevND2VUTy+/dazpm+9gZzHqeoBnRRhhNluLW11A7yi2U mSxAZrCH8X9EejIEhU8jAah4PYZ9tP8TmvUcnAPYpNgDh2JH6tlkGAYSbpadiCvf FJNhtev67uDL1jPzqQJLNo5Z04cg06M+9cYqrLD1q287LXsKs+LHXJY/AI56oIM= =YHkv -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM fixes from Paolo Bonzini: "PPC: - fix bug leading to lost IPIs and smp_call_function_many() lockups on POWER9 ARM: - locking fix - reset fix - GICv2 multi-source SGI injection fix - GICv2-on-v3 MMIO synchronization fix - make the console less verbose. x86: - fix device passthrough on AMD SME" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Fix device passthrough when SME is active kvm: arm/arm64: vgic-v3: Tighten synchronization for guests using v2 on v3 KVM: arm/arm64: vgic: Don't populate multiple LRs with the same vintid KVM: arm/arm64: Reduce verbosity of KVM init log KVM: arm/arm64: Reset mapped IRQs on VM reset KVM: arm/arm64: Avoid vcpu_load for other vcpu ioctls than KVM_RUN KVM: arm/arm64: vgic: Add missing irq_lock to vgic_mmio_read_pending KVM: PPC: Book3S HV: Fix trap number return from __kvmppc_vcore_entry	2018-03-18 11:23:12 -07:00
John David Anglin	9ef0f88fe5	parisc: Handle case where flush_cache_range is called with no context Just when I had decided that flush_cache_range() was always called with a valid context, Helge reported two cases where the "BUG_ON(!vma->vm_mm->context);" was hit on the phantom buildd: kernel BUG at /mnt/sdb6/linux/linux-4.15.4/arch/parisc/kernel/cache.c:587! CPU: 1 PID: 3254 Comm: kworker/1:2 Tainted: G D 4.15.0-1-parisc64-smp #1 Debian 4.15.4-1+b1 Workqueue: events free_ioctx IAOQ[0]: flush_cache_range+0x164/0x168 IAOQ[1]: flush_cache_page+0x0/0x1c8 RP(r2): unmap_page_range+0xae8/0xb88 Backtrace: [<00000000404a6980>] unmap_page_range+0xae8/0xb88 [<00000000404a6ae0>] unmap_single_vma+0xc0/0x188 [<00000000404a6cdc>] zap_page_range_single+0x134/0x1f8 [<00000000404a702c>] unmap_mapping_range+0x1cc/0x208 [<0000000040461518>] truncate_pagecache+0x98/0x108 [<0000000040461624>] truncate_setsize+0x9c/0xb8 [<00000000405d7f30>] put_aio_ring_file+0x80/0x100 [<00000000405d803c>] aio_free_ring+0x8c/0x290 [<00000000405d82c0>] free_ioctx+0x80/0x180 [<0000000040284e6c>] process_one_work+0x21c/0x668 [<00000000402854c4>] worker_thread+0x20c/0x778 [<0000000040291d44>] kthread+0x2d4/0x2e0 [<0000000040204020>] end_fault_vector+0x20/0xc0 This indicates that we need to handle the no context case in flush_cache_range() as we do in flush_cache_mm(). In thinking about this, I realized that we don't need to flush the TLB when there is no context. So, I added context checks to the large flush cases in flush_cache_mm() and flush_cache_range(). The large flush case occurs frequently in flush_cache_mm() and the change should improve fork performance. The v2 version of this change removes the BUG_ON from flush_cache_page() by skipping the TLB flush when there is no context. I also added code to flush the TLB in flush_cache_mm() and flush_cache_range() when we have a context that's not current. Now all three routines handle TLB flushes in a similar manner. Signed-off-by: John David Anglin <dave.anglin@bell.net> Cc: stable@vger.kernel.org # 4.9+ Signed-off-by: Helge Deller <deller@gmx.de>	2018-03-17 11:49:39 +01:00
Linus Torvalds	8f5fd927c3	for-4.16-rc5-tag -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAlqrzY4ACgkQxWXV+ddt WDtV4BAAiiv5XECNB3lIvcoFjst8E7xIJNeKKzFZh8HNm+L94zP00uqrpHkwQSUv tk9TGSYgbJJZMhw6I1rwYSKsoTkvP6NVMQZjpkJktWto9NXObKPvJzjKMdB+GYQR TWGLBaHsMhruMTOewhdVsxA5od5+LravygHaLmTDQmho8jZSSEe0qqpqlakRI9sQ +VZv1PDEI68IDMKMjOT7de7Ssq9c99BpGNXxvi5fy5kijyfgrs/BUfHBNV5r66MS y0Yw1Ab+nlLC7cE5Ql24/snDojcLoSl7f5eYt7ib+DAmsJucYQzTfUetFcrYf8IC EAmFs5Br6pbopi5M1IOxGZABNbr0qMS+zLvoF52cyka32nrmK/IMWl1s8bW0xfva gxkjouzJDOTMq28asSdyu73i9yFyLB+tEHpojECljo+K5ASzvDad34xqVRlxY6vN UJm4d76OqTfAyTRo7sQRWfLz/Cb1W+/v0mPotPWHWVyF0bUNsBgX5YKJhma3pU/z rHwhI/fqg0NTXBcNUMCosEwS2u5vnjIfiOHkDTl1Dv747S/jM0AaNEvpjDo1cOuk lsl4xKoDkQ6RpBCeFxRCW0wk1Nl/Su6RQ217xTtND+aIa4c6G7A9X8QAjj2XNAuK CktJJD7wHYSBTu1sQ+u2R/c2QEgkAxABr4NpmWTPCBNlIKEjd4A= =bXFe -----END PGP SIGNATURE----- Merge tag 'for-4.16-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "There's an important revert in this pull request that needs to go to stable as it causes a corruption on big endian machines. The other fix is for FIEMAP incorrectly reporting shared extents before a sync and one fix for a crash in raid56. So far we got only one report about the BE corruption, the stable kernels were out for like a week, so hopefully the scope of the damage is low" * tag 'for-4.16-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: Revert "btrfs: use proper endianness accessors for super_copy" btrfs: add missing initialization in btrfs_check_shared btrfs: Fix NULL pointer exception in find_bio_stripe	2018-03-16 13:37:42 -07:00
Linus Torvalds	8757ae23a3	Microblaze fixes for 4.16-rc6 - Use NO_BOOTMEM to fix boot issue - Fix opt lib endian dependencies -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iEYEABECAAYFAlqrr9kACgkQykllyylKDCHQZQCfVDfHfOTBoJ4vUyv1X1d6JSO4 EtUAoJ45BBNyEYvc6x5Pu/Zmx9QXqqe4 =lXoS -----END PGP SIGNATURE----- Merge tag 'microblaze-4.16-rc6' of git://git.monstr.eu/linux-2.6-microblaze Pull microblaze fixes from Michal Simek: - Use NO_BOOTMEM to fix boot issue - Fix opt lib endian dependencies * tag 'microblaze-4.16-rc6' of git://git.monstr.eu/linux-2.6-microblaze: microblaze: switch to NO_BOOTMEM microblaze: remove unused alloc_maybe_bootmem microblaze: Setup dependencies for ASM optimized lib functions	2018-03-16 13:27:34 -07:00
Borislav Petkov	bb8c13d61a	x86/microcode: Fix CPU synchronization routine Emanuel reported an issue with a hang during microcode update because my dumb idea to use one atomic synchronization variable for both rendezvous - before and after update - was simply bollocks: microcode: microcode_reload_late: late_cpus: 4 microcode: __reload_late: cpu 2 entered microcode: __reload_late: cpu 1 entered microcode: __reload_late: cpu 3 entered microcode: __reload_late: cpu 0 entered microcode: __reload_late: cpu 1 left microcode: Timeout while waiting for CPUs rendezvous, remaining: 1 CPU1 above would finish, leave and the others will still spin waiting for it to join. So do two synchronization atomics instead, which makes the code a lot more straightforward. Also, since the update is serialized and it also takes quite some time per microcode engine, increase the exit timeout by the number of CPUs on the system. That's ok because the moment all CPUs are done, that timeout will be cut short. Furthermore, panic when some of the CPUs timeout when returning from a microcode update: we can't allow a system with not all cores updated. Also, as an optimization, do not do the exit sync if microcode wasn't updated. Reported-by: Emanuel Czirai <xftroxgpx@protonmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Emanuel Czirai <xftroxgpx@protonmail.com> Tested-by: Ashok Raj <ashok.raj@intel.com> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lkml.kernel.org/r/20180314183615.17629-2-bp@alien8.de	2018-03-16 20:55:51 +01:00
Borislav Petkov	2613f36ed9	x86/microcode: Attempt late loading only when new microcode is present Return UCODE_NEW from the scanning functions to denote that new microcode was found and only then attempt the expensive synchronization dance. Reported-by: Emanuel Czirai <xftroxgpx@protonmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Emanuel Czirai <xftroxgpx@protonmail.com> Tested-by: Ashok Raj <ashok.raj@intel.com> Tested-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lkml.kernel.org/r/20180314183615.17629-1-bp@alien8.de	2018-03-16 20:55:51 +01:00
Linus Torvalds	1660a76a87	i915, amd and nouveau fixes -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJaqzIoAAoJEAx081l5xIa+8NQQAI87FM1TuO8SthUB91dtztk5 zj2BGLp8XKxhZwLxCdax5B8xZrXDR0tcNQZ1/anwMIm8mq3EyJ8XrfSBhz031UHJ dsDghvGe3ib66NGdqDQ5oFkWGwxJb5t23ze6hIRiEABC5jDCc47pKHaehWCMjcbX fkwSyQMRePtjXDy5McY3jfL4kwsNvHcWW/ZvCF2veUNsWuowLtbGalsS4DckkiXI +tbOvxuCY631BkNhoOk9ZaFlXXvgYduSWF7zOYLHmVKngO0AscatuSqOV7d4o5+y 9ZLOSBesuXhizoWaJQhkRN5QK20JebnuOuSpDfJCybaxvdu8fM7QVckjwaSxUW0Q d+UwxQU1H4k/pi6z76ScPzmdG19ozyCAP16mzL/OfPG4IF2H97vKlV1YIOp1K8X4 /iKgBEm1gCEZFgZyIt8ATTqkMxS2O5Q+emQB7yRrXMOFzDa9z2DhcC5zUyJqKbnd QoYWzCFDFIRNSgncH7CSmwBIK40H31I1sGtH6UOMbE8p8zh7bWVCssHU2IytBY1d xd97F7uPjSDDW8ASakRcEot2vuMJvgvsJJd3cZkbrwZ5hCON1eRAeWxowKTP6r5H sSm4bdMjFSPKORQP+oMzKBfmBPn1ZOtsVMOkMPsS2DrVzx6f8QmL0GYHWNYhqXNy powhJPt4kyssA0i3SoBj =lSpS -----END PGP SIGNATURE----- Merge tag 'drm-fixes-for-v4.16-rc6' of git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "i915, amd and nouveau fixes. i915: - backlight fix for some panels - pm fix - fencing fix - some GVT fixes amdgpu: - backlight fix across suspend/resume - object destruction ordering issue fix - displayport fix nouveau: - two backlight fixes - fix for some lockups Pretty quiet week, seems like everyone was fixing backlights" * tag 'drm-fixes-for-v4.16-rc6' of git://people.freedesktop.org/~airlied/linux: drm/nouveau/bl: fix backlight regression drm/nouveau/bl: Fix oops on driver unbind drm/nouveau/mmu: ALIGN_DOWN correct variable drm/i915/gvt: fix user copy warning by whitelist workload rb_tail field drm/i915/gvt: Correct the privilege shadow batch buffer address drm/amdgpu/dce: Don't turn off DP sink when disconnected drm/amdgpu: save/restore backlight level in legacy dce code drm/radeon: fix prime teardown order drm/amdgpu: fix prime teardown order drm/i915: Kick the rps worker when changing the boost frequency drm/i915: Only prune fences after wait-for-all drm/i915: Enable VBT based BL control for DP drm/i915/gvt: keep oa config in shadow ctx drm/i915/gvt: Add runtime_pm_get/put into gvt_switch_mmio	2018-03-16 12:55:00 -07:00
David Sterba	093e037ca8	Revert "btrfs: use proper endianness accessors for super_copy" This reverts commit `3c181c12c4`. The offending patch was merged in 4.16-rc4 and was promptly applied to stable kernels 4.14.25 and 4.15.8. The patch causes a corruption in several superblock items on big-endian machines because of messed up endianity conversions. The damage is manually repairable. A filesystem cannot be mounted again after it has been unmounted once. We do a full revert and not a fixup so stable can pick that patch ASAP. Fixes: `3c181c12c4` ("btrfs: use proper endianness accessors for super_copy") Link: https://lkml.kernel.org/r/1521139304@msgid.manchmal.in-ulm.de CC: stable@vger.kernel.org # 4.14+ Reported-by: Christoph Biedl <linux-kernel.bfrz@manchmal.in-ulm.de> Signed-off-by: David Sterba <dsterba@suse.com>	2018-03-16 14:49:44 +01:00
Tom Lendacky	daaf216c06	KVM: x86: Fix device passthrough when SME is active When using device passthrough with SME active, the MMIO range that is mapped for the device should not be mapped encrypted. Add a check in set_spte() to insure that a page is not mapped encrypted if that page is a device MMIO page as indicated by kvm_is_mmio_pfn(). Cc: <stable@vger.kernel.org> # 4.14.x- Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-03-16 14:32:23 +01:00
Rob Herring	101646a24a	microblaze: switch to NO_BOOTMEM Microblaze doesn't set CONFIG_NO_BOOTMEM and so memblock_virt_alloc() doesn't work for CONFIG_HAVE_MEMBLOCK && !CONFIG_NO_BOOTMEM. Similar change was already done by others architectures "ARM: mm: Remove bootmem code and switch to NO_BOOTMEM" (sha1: `84f452b1e8`) or "openrisc: Consolidate setup to use memblock instead of bootmem" (sha1: `266c7fad15`) or "parisc: Drop bootmem and switch to memblock" (sha1: `4fe9e1d957`) or "powerpc: Remove bootmem allocator" (sha1: `10239733ee`) or "s390/mm: Convert bootmem to memblock" (sha1: `50be634507`) or "sparc64: Convert over to NO_BOOTMEM." (sha1: `625d693e97`) or "xtensa: drop sysmem and switch to memblock" (sha1: `0e46c1115f`) Issue was introduced by: "of/fdt: use memblock_virt_alloc for early alloc" (sha1: `0fa1c57934`) Signed-off-by: Rob Herring <robh@kernel.org> Tested-by: Alvaro Gamez Machado <alvaro.gamez@hazent.com> Tested-by: Michal Simek <michal.simek@xilinx.com> Signed-off-by: Michal Simek <michal.simek@xilinx.com>	2018-03-16 12:51:27 +01:00
Rob Herring	cd4dfee6a8	microblaze: remove unused alloc_maybe_bootmem alloc_maybe_bootmem is unused, so remove it. Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Michal Simek <michal.simek@xilinx.com>	2018-03-16 12:51:27 +01:00
Michal Simek	18ffc0cce4	microblaze: Setup dependencies for ASM optimized lib functions The patch: "microblaze: Setup proper dependency for optimized lib functions" (sha1: `7b6ce52be3`) didn't setup all dependencies properly. Optimized lib functions in C are also present for little endian and optimized library functions in assembler are implemented only for big endian version. Reported-by: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Michal Simek <michal.simek@xilinx.com>	2018-03-16 12:51:26 +01:00
Alexander Sergeyev	e3b3121fa8	x86/speculation: Remove Skylake C2 from Speculation Control microcode blacklist In accordance with Intel's microcode revision guidance from March 6 MCU rev 0xc2 is cleared on both Skylake H/S and Skylake Xeon E3 processors that share CPUID 506E3. Signed-off-by: Alexander Sergeyev <sergeev917@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Jia Zhang <qianyue.zj@alibaba-inc.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Kyle Huey <me@kylehuey.com> Cc: David Woodhouse <dwmw@amazon.co.uk> Link: https://lkml.kernel.org/r/20180313193856.GA8580@localhost.localdomain	2018-03-16 12:33:11 +01:00
Dave Airlie	3a1b5de36f	Only GVT fixes: - Two warnings fix for runtime pm and usr copy (Xiong, Zhenyu) - OA context fix for vGPU profiling (Min) - privilege batch buffer reloc fix (Fred) -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJaqwRYAAoJEPpiX2QO6xPKZT0H/30rwEPNL0Z9LMrZY/WC4Yr8 0Hbc7jn0pYgCV+Cm8E1OPwQv2lqZMx5VAPV7ggQqaVjQam629NVODVUVNyMorQ7l uy3kbi3vAcazuvaApUchMDppfl919FP5QXFKaEm8HU3C8oROaKO6lZKc6OJFd1Bu eNLwg3FqI0JXRUqfggzyaqtpV8bMXzLVeQh98wlGbpkEF0yvXlTPProLJx4WX2kq /8OULi2k7g9QxOd8S9l3TdSHpOJKPzoDecvGW6WFM+0q5POs+Ybk69yU32irt/Fk 2dNbqn8GgPXTkFMaBzlBFqq6Kgh4y1b6eQDjQC1oZGOyjK/HwtNteZHREEzg15A= =fg6j -----END PGP SIGNATURE----- Merge tag 'drm-intel-fixes-2018-03-15' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes Only GVT fixes: - Two warnings fix for runtime pm and usr copy (Xiong, Zhenyu) - OA context fix for vGPU profiling (Min) - privilege batch buffer reloc fix (Fred) * tag 'drm-intel-fixes-2018-03-15' of git://anongit.freedesktop.org/drm/drm-intel: drm/i915/gvt: fix user copy warning by whitelist workload rb_tail field drm/i915/gvt: Correct the privilege shadow batch buffer address drm/i915/gvt: keep oa config in shadow ctx drm/i915/gvt: Add runtime_pm_get/put into gvt_switch_mmio	2018-03-16 12:51:35 +10:00
Dave Airlie	d4487b57a4	Merge branch 'linux-4.16' of git://github.com/skeggsb/linux into drm-fixes nouveau regression fixes. * 'linux-4.16' of git://github.com/skeggsb/linux: drm/nouveau/bl: fix backlight regression drm/nouveau/bl: Fix oops on driver unbind drm/nouveau/mmu: ALIGN_DOWN correct variable	2018-03-16 12:06:17 +10:00
Linus Torvalds	df09348f78	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs fixes from Al Viro: - backport-friendly part of lock_parent() race fix - a fix for an assumption in the heurisic used by path_connected() that is not true on NFS - livelock fixes for d_alloc_parallel() * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: Teach path_connected to handle nfs filesystems with multiple roots. fs: dcache: Use READ_ONCE when accessing i_dir_seq fs: dcache: Avoid livelock between d_alloc_parallel and __d_add lock_parent() needs to recheck if dentry got __dentry_kill'ed under it	2018-03-15 18:57:14 -07:00
Karol Herbst	9e75dc61ea	drm/nouveau/bl: fix backlight regression Fixes: `3c66c87dc9` ("drm/nouveau/disp: remove hw-specific customisation of output paths") Suggested-by: Ben Skeggs <skeggsb@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2018-03-16 11:55:05 +10:00
Lukas Wunner	76f2e2bc62	drm/nouveau/bl: Fix oops on driver unbind Unbinding nouveau on a dual GPU MacBook Pro oopses because we iterate over the bl_connectors list in nouveau_backlight_exit() but skipped initializing it in nouveau_backlight_init(). Stacktrace for posterity: BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 IP: nouveau_backlight_exit+0x2b/0x70 [nouveau] nouveau_display_destroy+0x29/0x80 [nouveau] nouveau_drm_unload+0x65/0xe0 [nouveau] drm_dev_unregister+0x3c/0xe0 [drm] drm_put_dev+0x2e/0x60 [drm] nouveau_drm_device_remove+0x47/0x70 [nouveau] pci_device_remove+0x36/0xb0 device_release_driver_internal+0x157/0x220 driver_detach+0x39/0x70 bus_remove_driver+0x51/0xd0 pci_unregister_driver+0x2a/0xa0 nouveau_drm_exit+0x15/0xfb0 [nouveau] SyS_delete_module+0x18c/0x290 system_call_fast_compare_end+0xc/0x6f Fixes: `b53ac1ee12` ("drm/nouveau/bl: Do not register interface if Apple GMUX detected") Cc: stable@vger.kernel.org # v4.10+ Cc: Pierre Moreau <pierre.morrow@free.fr> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Ben Skeggs <bskeggs@redhat.com>	2018-03-16 11:55:05 +10:00
Māris Nartišs	da5e45e619	drm/nouveau/mmu: ALIGN_DOWN correct variable Commit 7110c89bb8852ff8b0f88ce05b332b3fe22bd11e ("mmu: swap out round for ALIGN") replaced two calls to round/rounddown with ALIGN/ALIGN_DOWN, but erroneously applied ALIGN_DOWN to a different variable (addr) and left intended variable (tail) not rounded/ALIGNed. As a result screen corruption, X lockups are observable. An example of kernel log of affected system with NV98 card where it was bisected: nouveau 0000:01:00.0: gr: TRAP_M2MF 00000002 [IN] nouveau 0000:01:00.0: gr: TRAP_M2MF 00320951 400007c0 00000000 04000000 nouveau 0000:01:00.0: gr: 00200000 [] ch 1 [000fbbe000 DRM] subc 4 class 5039 mthd 0100 data 00000000 nouveau 0000:01:00.0: fb: trapped read at 0040000000 on channel 1 [0fbbe000 DRM] engine 00 [PGRAPH] client 03 [DISPATCH] subclient 04 [M2M_IN] reason 00000006 [NULL_DMAOBJ] Fixes bug 105173 ("[MCP79][Regression] Unhandled NULL pointer dereference in nvkm_object_unmap since kernel 4.15") https://bugs.freedesktop.org/show_bug.cgi?id=105173 Fixes: 7110c89bb885 ("mmu: swap out round for ALIGN ") Tested-by: Pierre Moreau <pierre.morrow@free.fr> Reviewed-by: Pierre Moreau <pierre.morrow@free.fr> Signed-off-by: Maris Nartiss <maris.nartiss@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Cc: stable@vger.kernel.org # v4.15+	2018-03-16 11:55:04 +10:00
Eric W. Biederman	95dd77580c	fs: Teach path_connected to handle nfs filesystems with multiple roots. On nfsv2 and nfsv3 the nfs server can export subsets of the same filesystem and report the same filesystem identifier, so that the nfs client can know they are the same filesystem. The subsets can be from disjoint directory trees. The nfsv2 and nfsv3 filesystems provides no way to find the common root of all directory trees exported form the server with the same filesystem identifier. The practical result is that in struct super s_root for nfs s_root is not necessarily the root of the filesystem. The nfs mount code sets s_root to the root of the first subset of the nfs filesystem that the kernel mounts. This effects the dcache invalidation code in generic_shutdown_super currently called shrunk_dcache_for_umount and that code for years has gone through an additional list of dentries that might be dentry trees that need to be freed to accomodate nfs. When I wrote path_connected I did not realize nfs was so special, and it's hueristic for avoiding calling is_subdir can fail. The practical case where this fails is when there is a move of a directory from the subtree exposed by one nfs mount to the subtree exposed by another nfs mount. This move can happen either locally or remotely. With the remote case requiring that the move directory be cached before the move and that after the move someone walks the path to where the move directory now exists and in so doing causes the already cached directory to be moved in the dcache through the magic of d_splice_alias. If someone whose working directory is in the move directory or a subdirectory and now starts calling .. from the initial mount of nfs (where s_root == mnt_root), then path_connected as a heuristic will not bother with the is_subdir check. As s_root really is not the root of the nfs filesystem this heuristic is wrong, and the path may actually not be connected and path_connected can fail. The is_subdir function might be cheap enough that we can call it unconditionally. Verifying that will take some benchmarking and the result may not be the same on all kernels this fix needs to be backported to. So I am avoiding that for now. Filesystems with snapshots such as nilfs and btrfs do something similar. But as the directory tree of the snapshots are disjoint from one another and from the main directory tree rename won't move things between them and this problem will not occur. Cc: stable@vger.kernel.org Reported-by: Al Viro <viro@ZenIV.linux.org.uk> Fixes: `397d425dc2` ("vfs: Test for and handle paths that are unreachable from their mnt_root") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-03-15 18:48:38 -04:00
Rodrigo Vivi	05b429a8ee	Merge tag 'gvt-fixes-2018-03-15' of https://github.com/intel/gvt-linux into drm-intel-fixes gvt-fixes-2018-03-15 - Two warnings fix for runtime pm and usr copy (Xiong, Zhenyu) - OA context fix for vGPU profiling (Min) - privilege batch buffer reloc fix (Fred) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180315100023.5n5a74afky6qinoh@zhen-hp.sh.intel.com	2018-03-15 15:37:57 -07:00
David S. Miller	cfb61b5e3e	sparc64: Fix regression in pmdp_invalidate(). pmdp_invalidate() was changed to update the pmd atomically (to not lose dirty/access bits) and return the original pmd value. However, in doing so, we lost a lot of the essential work that set_pmd_at() does, namely to update hugepage mapping counts and queuing up the batched TLB flush entry. Thus we were not flushing entries out of the TLB when making such PMD changes. Fix this by abstracting the accounting work of set_pmd_at() out into a separate function, and call it from pmdp_establish(). Fixes: `a8e654f01c` ("sparc64: update pmdp_invalidate() to return old pmd value") Signed-off-by: David S. Miller <davem@davemloft.net>	2018-03-15 14:18:00 -07:00
Paolo Bonzini	52be7a467e	Fix for PPC KVM for 4.16 - Fix bug leading to lost IPIs on POWER9 and hence to other CPUs reporting lockups in smp_call_function_many(). -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJaqNxFAAoJEJ2a6ncsY3GfmwQH/3wz36kFHufskFhqtr3kQKYS /LFsydZKF/8puR8CobVcvqRX/KP/WpjTvpC4GhYrto7IVPJBpuJuozSY5LDLVg9s kw5uNQeZREFjua2Lo78/YUh+wN7Xx3LtBC/ass6QOM51dGnfeUpSiSuzGQhMrpaf CaDVT/0M1zPcQqDvQSinsTJm5xNTJ2cO6Q2tTFtHOWQGBKB1uGxexBx9NAEO71vh 6KOgU9uIW83Vy2tubOEN6vaDEOUtm6MOwaTbFQo3Dvt7VPDoUmU099K0+EI8UBDF /PQ/yXWaAkSrZdyDFsLWONd9jX0LrvhdNOw1bh46fPdr+SCTNp9pFRCcq3P+MhI= =44ey -----END PGP SIGNATURE----- Merge tag 'kvm-ppc-fixes-4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc into kvm-master Fix for PPC KVM for 4.16 - Fix bug leading to lost IPIs on POWER9 and hence to other CPUs reporting lockups in smp_call_function_many().	2018-03-15 21:57:26 +01:00
Paolo Bonzini	bb9b4dbe0d	kvm/arm fixes for 4.16, take 2 - Peace of mind locking fix in vgic_mmio_read_pending - Allow hw-mapped interrupts to be reset when the VM resets - Fix GICv2 multi-source SGI injection - Fix MMIO synchronization for GICv2 on v3 emulation - Remove excess verbosity on the console -----BEGIN PGP SIGNATURE----- iQJJBAABCAAzFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAlqqp/cVHG1hcmMuenlu Z2llckBhcm0uY29tAAoJECPQ0LrRPXpDAGkP/2LMhFN561PKlqgu5V4hFvowJiXb Gbb/qi095vtDGccbKmJKAZp3jyOM2oJEMUkx5RBYglWjW0mxb3zPAAxhldXiqv/2 CrOGGlS/FwfyIjCt7870pltDOIgRmk8Fv/MyQjjGKF6VAghd6yVHIZiOUjiriUyz 6hNyc2znLm0tBqm4j3HTXKHpD23YseW387pQoeQ03/WiXiZ60O3e3k0yppXO81qE b7TGT4Bz04mxlAISZVZeTmG7P7P4ej6+NhOH+1kxacseLzHdECPBA0JRcwRpfLkP 5JFodUOX7/KHpvpMLUxRNRnLBei9WUL4o2LAEV0qDaj7nlAud0kKUm22RLaVKDm+ 8FSUQ12XKqnZsRrl6IizU1oAb1I1iV3j9HF5iNf3mk9AO27REGk0b8fDyRzDj300 xpySgvIgA+f+EyY+3ve0AmEUa5QKz/WLuik2ZCqpVOuufrO8XpS+zjn1L1tzTlkR 95EahDA7enutw47G0uWtxoPMeU4HTZS/CAiFwUbq8BEK7T3Rct7UySPLwgeYBoji MUlCRhPyAANCJmtO6rpOS3htkQ3XkkO1DVIGLuWC5Zl00W1T5I5+VRrVL1YI4v3O d2ui9r5X5Vmg4OUdhr2D9fXgPWWKEbqD90jv40rGLsMl0g/IwrC+o2VxgYxSeu5x CLUYILwEA5NDZSof =iyYE -----END PGP SIGNATURE----- Merge tag 'kvm-arm-fixes-for-v4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master kvm/arm fixes for 4.16, take 2 - Peace of mind locking fix in vgic_mmio_read_pending - Allow hw-mapped interrupts to be reset when the VM resets - Fix GICv2 multi-source SGI injection - Fix MMIO synchronization for GICv2 on v3 emulation - Remove excess verbosity on the console	2018-03-15 21:45:37 +01:00
Linus Torvalds	e2c15aff5f	sound fixes for 4.16-rc6 A series of small fixes in ASoC, HD-audio and core stuff: - A UAF fix in ALSA PCM core - Yet more hardening for ALSA sequencer - A regression fix for the previous HD-audio power_save option change - Various ASoC codec fixes (sgtl5000, rt5651, hdmi-codec, wm_adsp) - Minor ASoC platform fixes (AMD ACP, sun4i) -----BEGIN PGP SIGNATURE----- iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAlqqprkOHHRpd2FpQHN1 c2UuZGUACgkQLtJE4w1nLE+Fow//b3ud9UpNKYArnADF+jt5rhXHKPZdQf4W9Thy RVCv4twJzB2kT7gxw/x5dghMBYYWbHtOMqHtanwqIu9YehQ6XiKyRsAQrD5PFAzM 27h+bNNBIEAJ+jSuJS8pp2T6mQ8KTyMt8lCpsA5yhuBIWahraTJ76tk3CmkXrKwe Stk0RxSDjH4TVbPSHbBc064freBuJcLvDNXSKMxxYc/0mJ2qatvdcKevDeISjvwB yQkDiyHdfpOugYxMVx0H8T1bIM1esUg19+CF9E8/JuKKibKRkOPHPCLjPI8sXVSV zYpfbH5SZT69H7VQRlX1y9rqQycTeMyZyDZBQlTx2KoDh+KnQJ6JKH9NoKg9l5o0 GPr6zMBaIxCI4mAJd5VMOfMCQVGa1pU6KD7dnRbBtFDlSsO93IGf1EMi9L+Yh2KC GV6EJxo82h7JPZotYTQEh9C4OvStIKxdm8kmy3Yd1N7MOAy0TpemC41rYp3KAaET W2GrT39c1uyPMRU27GVb08X64itDIpAFGvA3sEaT7TGoO3mAv30tfN4SUu8Ls3n/ 0dCnKN68yAGISogz6fbrvSz2tQoZQf1ACz548zony753tOvmxndtvgPIvGDDa/j3 z/FYfoCMTF/KijQhEKK7pOu5lwEaOd66VC3XZgyYgCON0SZbzcStqQF33kGuIlhT BcmjHOI= =mCc/ -----END PGP SIGNATURE----- Merge tag 'sound-4.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A series of small fixes in ASoC, HD-audio and core stuff: - a UAF fix in ALSA PCM core - yet more hardening for ALSA sequencer - a regression fix for the previous HD-audio power_save option change - various ASoC codec fixes (sgtl5000, rt5651, hdmi-codec, wm_adsp) - minor ASoC platform fixes (AMD ACP, sun4i)" * tag 'sound-4.16-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda - Revert power_save option default value ALSA: pcm: Fix UAF in snd_pcm_oss_get_formats() ALSA: seq: Clear client entry before deleting else at closing ALSA: seq: Fix possible UAF in snd_seq_check_queue() ASoC: amd: 16bit resolution support for i2s sp instance ASoC: wm_adsp: For TLV controls only register TLV get/set ASoC: sun4i-i2s: Fix RX slot number of SUN8I ASoC: hdmi-codec: Fix module unloading caused kernel crash ASoC: rt5651: Fix regcache sync errors on resume ASoC: sgtl5000: Fix suspend/resume MAINTAINERS: Add myself as sgtl5000 maintainer ASoC: samsung: Add the DT binding files entry to MAINTAINERS sgtl5000: change digital_mute policy	2018-03-15 11:07:35 -07:00
Linus Torvalds	667058ae60	- A stable DM multipath fix to restore ability to pass integrity data - 2 DM multipath fixes for a fix that was merged into 4.15-rc5 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJaqoz6AAoJEMUj8QotnQNaNtEH/37YSs5yBXSzbsvRQuVsht1G C5qXrTB22JczSBNBfC+i0h5o9bXSm5brL/lOcP0qEokXnhEFgVK/nfcJmDRErUuJ 7TJhvYPptrFJ0hR7uMzqXxqp5ZDY+/mmzj69s0ZGZyhVacvO/CDW9CPUg7vgROp2 lEUohb3vi1ZaaHavbb/di4/m88+niJw5Yi/TBV24O7UxixQ3F1aeC5UENbfOt9re 5CauY3OcosS0Z+Kr8rmOzy2l/yBd2bYl8YGf4sJHWp4hmPec9PXUBOAK9wca9ouA UNVh/2SBddjmM8wsDzUGsJJyZW+YN2eh2fzs9eJIDfrOFt4BhueUVUzKcxXo+RE= =XErm -----END PGP SIGNATURE----- Merge tag 'for-4.16/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - a stable DM multipath fix to restore ability to pass integrity data - two DM multipath fixes for a fix that was merged into 4.16-rc5 * tag 'for-4.16/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm mpath: fix passing integrity data dm mpath: eliminate need to use scsi_device_from_queue dm mpath: fix uninitialized 'pg_init_wait' waitqueue_head NULL pointer	2018-03-15 11:04:46 -07:00
Zhenyu Wang	850555d1d3	drm/i915/gvt: fix user copy warning by whitelist workload rb_tail field This is to fix warning got as: [ 6730.476938] ------------[ cut here ]------------ [ 6730.476979] Bad or missing usercopy whitelist? Kernel memory exposure attempt detected from SLAB object 'gvt-g_vgpu_workload' (offset 120, size 4)! [ 6730.477021] WARNING: CPU: 2 PID: 441 at mm/usercopy.c:81 usercopy_warn+0x7e/0xa0 [ 6730.477042] Modules linked in: tun(E) bridge(E) stp(E) llc(E) kvmgt(E) x86_pkg_temp_thermal(E) vfio_mdev(E) intel_powerclamp(E) mdev(E) coretemp(E) vfio_iommu_type1(E) vfio(E) kvm_intel(E) kvm(E) hid_generic(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) usbhid(E) i915(E) crc32c_intel(E) hid(E) ghash_clmulni_intel(E) pcbc(E) aesni_intel(E) aes_x86_64(E) crypto_simd(E) cryptd(E) glue_helper(E) intel_cstate(E) idma64(E) evdev(E) virt_dma(E) iTCO_wdt(E) intel_uncore(E) intel_rapl_perf(E) intel_lpss_pci(E) sg(E) shpchp(E) mei_me(E) pcspkr(E) iTCO_vendor_support(E) intel_lpss(E) intel_pch_thermal(E) prime_numbers(E) mei(E) mfd_core(E) video(E) acpi_pad(E) button(E) binfmt_misc(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) fscrypto(E) sd_mod(E) e1000e(E) xhci_pci(E) sdhci_pci(E) [ 6730.477244] ptp(E) cqhci(E) xhci_hcd(E) pps_core(E) sdhci(E) mmc_core(E) i2c_i801(E) usbcore(E) thermal(E) fan(E) [ 6730.477276] CPU: 2 PID: 441 Comm: gvt workload 0 Tainted: G E 4.16.0-rc1-gvt-staging-0213+ #127 [ 6730.477303] Hardware name: /NUC6i5SYB, BIOS SYSKLi35.86A.0039.2016.0316.1747 03/16/2016 [ 6730.477326] RIP: 0010:usercopy_warn+0x7e/0xa0 [ 6730.477340] RSP: 0018:ffffba6301223d18 EFLAGS: 00010286 [ 6730.477355] RAX: 0000000000000000 RBX: ffff8f41caae9838 RCX: 0000000000000006 [ 6730.477375] RDX: 0000000000000007 RSI: 0000000000000082 RDI: ffff8f41dad166f0 [ 6730.477395] RBP: 0000000000000004 R08: 0000000000000576 R09: 0000000000000000 [ 6730.477415] R10: ffffffffb1293fb2 R11: 00000000ffffffff R12: 0000000000000001 [ 6730.477447] R13: ffff8f41caae983c R14: ffff8f41caae9838 R15: 00007f183ca2b000 [ 6730.477467] FS: 0000000000000000(0000) GS:ffff8f41dad00000(0000) knlGS:0000000000000000 [ 6730.477489] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 6730.477506] CR2: 0000559462817291 CR3: 000000028b46c006 CR4: 00000000003626e0 [ 6730.477526] Call Trace: [ 6730.477537] __check_object_size+0x9c/0x1a0 [ 6730.477562] __kvm_write_guest_page+0x45/0x90 [kvm] [ 6730.477585] kvm_write_guest+0x46/0x80 [kvm] [ 6730.477599] kvmgt_rw_gpa+0x9b/0xf0 [kvmgt] [ 6730.477642] workload_thread+0xa38/0x1040 [i915] [ 6730.477659] ? do_wait_intr_irq+0xc0/0xc0 [ 6730.477673] ? finish_wait+0x80/0x80 [ 6730.477707] ? clean_workloads+0x120/0x120 [i915] [ 6730.477722] kthread+0x111/0x130 [ 6730.477733] ? _kthread_create_worker_on_cpu+0x60/0x60 [ 6730.477750] ? exit_to_usermode_loop+0x6f/0xb0 [ 6730.477766] ret_from_fork+0x35/0x40 [ 6730.477777] Code: 48 c7 c0 20 e3 25 b1 48 0f 44 c2 41 50 51 41 51 48 89 f9 49 89 f1 4d 89 d8 4c 89 d2 48 89 c6 48 c7 c7 78 e3 25 b1 e8 b2 bc e4 ff <0f> ff 48 83 c4 18 c3 48 c7 c6 09 d0 26 b1 49 89 f1 49 89 f3 eb [ 6730.477849] ---[ end trace cae869c1c323e45a ]--- By whitelist guest page write from workload struct allocated from kmem cache. Reviewed-by: Hang Yuan <hang.yuan@linux.intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com> (cherry picked from commit 5627705406874df57fdfad3b4e0c9aedd3b007df)	2018-03-15 15:07:22 +08:00
fred gao	ef75c68586	drm/i915/gvt: Correct the privilege shadow batch buffer address Once the ring buffer is copied to ring_scan_buffer and scanned, the shadow batch buffer start address is only updated into ring_scan_buffer, not the real ring address allocated through intel_ring_begin in later copy_workload_to_ring_buffer. This patch is only to set the right shadow batch buffer address from Ring buffer, not include the shadow_wa_ctx. v2: - refine some comments. (Zhenyu) v3: - fix typo in title. (Zhenyu) v4: - remove the unnecessary comments. (Zhenyu) - add comments in bb_start_cmd_va update. (Zhenyu) Fixes: `0a53bc07f0` ("drm/i915/gvt: Separate cmd scan from request allocation") Cc: stable@vger.kernel.org # v4.15 Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Cc: Yulei Zhang <yulei.zhang@intel.com> Signed-off-by: fred gao <fred.gao@intel.com> Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>	2018-03-15 15:06:26 +08:00
Linus Torvalds	0aa3fdb8b3	SCSI fixes on 20180314 This is four patches, consisting of one regression from the merge window (qla2xxx) one lonstanding memory leak (sd_zbc) one event queue mislabelling which we want to eliminate to discourage the pattern (mpt3sas) and one behaviour change because re-reading the partition table shouldn't clear the ro flag. Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com> -----BEGIN PGP SIGNATURE----- iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCWqmwAyYcamFtZXMuYm90 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishXztAQCYs0s/ oysGLnl2qkuSC8u7vzzLURfQ6l2MGq4ic8Y/mQD/ZgTvf9eGj5OhARcRk29D3XRJ zDY3KbkNIajadXlN3LY= =eu2i -----END PGP SIGNATURE----- Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is four patches, consisting of one regression from the merge window (qla2xxx), one long-standing memory leak (sd_zbc), one event queue mislabelling which we want to eliminate to discourage the pattern (mpt3sas), and one behaviour change because re-reading the partition table shouldn't clear the ro flag" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: sd: Keep disk read-only when re-reading partition scsi: qla2xxx: Fix crashes in qla2x00_probe_one on probe failure scsi: sd_zbc: Fix potential memory leak scsi: mpt3sas: Do not mark fw_event workqueue as WQ_MEM_RECLAIM	2018-03-14 17:02:49 -07:00

1 2 3 4 5 ...

738765 Commits