linux_dsm_epyc7002/kernel/sched
Linus Torvalds 9dcb8b685f mm: remove per-zone hashtable of bitlock waitqueues
The per-zone waitqueues exist because of a scalability issue with the
page waitqueues on some NUMA machines, but it turns out that they hurt
normal loads, and now with the vmalloced stacks they also end up
breaking gfs2 that uses a bit_wait on a stack object:

     wait_on_bit(&gh->gh_iflags, HIF_WAIT, TASK_UNINTERRUPTIBLE)

where 'gh' can be a reference to the local variable 'mount_gh' on the
stack of fill_super().

The reason the per-zone hash table breaks for this case is that there is
no "zone" for virtual allocations, and trying to look up the physical
page to get at it will fail (with a BUG_ON()).

It turns out that I actually complained to the mm people about the
per-zone hash table for another reason just a month ago: the zone lookup
also hurts the regular use of "unlock_page()" a lot, because the zone
lookup ends up forcing several unnecessary cache misses and generates
horrible code.

As part of that earlier discussion, we had a much better solution for
the NUMA scalability issue - by just making the page lock have a
separate contention bit, the waitqueue doesn't even have to be looked at
for the normal case.

Peter Zijlstra already has a patch for that, but let's see if anybody
even notices.  In the meantime, let's fix the actual gfs2 breakage by
simplifying the bitlock waitqueues and removing the per-zone issue.

Reported-by: Andreas Gruenbacher <agruenba@redhat.com>
Tested-by: Bob Peterson <rpeterso@redhat.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-27 09:27:57 -07:00
..
auto_group.c sched/core: Move the sched_to_prio[] arrays out of line 2015-12-04 10:34:46 +01:00
auto_group.h sched, timer: Convert usages of ACCESS_ONCE() in the scheduler to READ_ONCE()/WRITE_ONCE() 2015-05-08 12:11:32 +02:00
clock.c sched/clock: Make local_clock()/cpu_clock() inline 2016-04-13 12:25:22 +02:00
completion.c sched/completion: Serialize completion_done() with complete() 2015-02-18 14:27:40 +01:00
core.c mm: remove per-zone hashtable of bitlock waitqueues 2016-10-27 09:27:57 -07:00
cpuacct.c sched/cpuacct: Introduce cpuacct.usage_all to show all CPU stats together 2016-07-09 13:56:15 +02:00
cpuacct.h sched/cpuacct: Simplify the cpuacct code 2016-03-21 11:00:28 +01:00
cpudeadline.c sched/deadline: Split cpudl_set() into cpudl_set() and cpudl_clear() 2016-09-05 13:29:43 +02:00
cpudeadline.h sched/deadline: Split cpudl_set() into cpudl_set() and cpudl_clear() 2016-09-05 13:29:43 +02:00
cpufreq_schedutil.c cpufreq: schedutil: Add iowait boosting 2016-09-13 23:36:01 +02:00
cpufreq.c cpufreq / sched: Pass flags to cpufreq_update_util() 2016-08-16 22:14:55 +02:00
cpupri.c sched/core: Use tsk_cpus_allowed() instead of accessing ->cpus_allowed 2016-05-12 09:55:35 +02:00
cpupri.h sched/cpupri: Remove unnecessary definitions in cpupri.h 2014-11-16 10:58:59 +01:00
cputime.c sched/irqtime: Consolidate irqtime flushing code 2016-09-30 11:46:41 +02:00
deadline.c Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-10-03 13:39:00 -07:00
debug.c Merge branch 'for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2016-10-14 12:18:50 -07:00
fair.c sched/fair: Fix incorrect task group ->load_avg 2016-10-19 15:04:47 +02:00
features.h sched/fair: Convert arch_scale_cpu_capacity() from weak function to #define 2015-09-13 09:52:55 +02:00
idle_task.c sched/core: Rewrite and improve select_idle_siblings() 2016-09-30 11:03:09 +02:00
idle.c nmi_backtrace: generate one-line reports for idle cpus 2016-10-07 18:46:30 -07:00
loadavg.c sched/core: Correct off by one bug in load migration calculation 2016-07-13 14:58:20 +02:00
Makefile cpufreq: schedutil: New governor based on scheduler utilization data 2016-04-02 01:09:12 +02:00
rt.c cpufreq / sched: Pass runqueue pointer to cpufreq_update_util() 2016-08-16 22:16:03 +02:00
sched.h Merge branch 'x86-asm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2016-10-03 16:13:28 -07:00
stats.c sched: use %*pb[l] to print bitmaps including cpumasks and nodemasks 2015-02-13 21:21:37 -08:00
stats.h sched/debug: Rename 'schedstat_val()' -> 'schedstat_val_or_zero()' 2016-09-05 13:29:46 +02:00
stop_task.c locking/lockdep, sched/core: Implement a better lock pinning scheme 2016-05-05 09:23:59 +02:00
swait.c wait.[ch]: Introduce the simple waitqueue (swait) implementation 2016-02-25 11:27:16 +01:00
wait.c mm: remove per-zone hashtable of bitlock waitqueues 2016-10-27 09:27:57 -07:00