linux_dsm_epyc7002/include
Mike Kravetz 1b426bac66 hugetlb: use same fault hash key for shared and private mappings
hugetlb uses a fault mutex hash table to prevent page faults of the
same pages concurrently.  The key for shared and private mappings is
different.  Shared keys off address_space and file index.  Private keys
off mm and virtual address.  Consider a private mappings of a populated
hugetlbfs file.  A fault will map the page from the file and if needed
do a COW to map a writable page.

Hugetlbfs hole punch uses the fault mutex to prevent mappings of file
pages.  It uses the address_space file index key.  However, private
mappings will use a different key and could race with this code to map
the file page.  This causes problems (BUG) for the page cache remove
code as it expects the page to be unmapped.  A sample stack is:

page dumped because: VM_BUG_ON_PAGE(page_mapped(page))
kernel BUG at mm/filemap.c:169!
...
RIP: 0010:unaccount_page_cache_page+0x1b8/0x200
...
Call Trace:
__delete_from_page_cache+0x39/0x220
delete_from_page_cache+0x45/0x70
remove_inode_hugepages+0x13c/0x380
? __add_to_page_cache_locked+0x162/0x380
hugetlbfs_fallocate+0x403/0x540
? _cond_resched+0x15/0x30
? __inode_security_revalidate+0x5d/0x70
? selinux_file_permission+0x100/0x130
vfs_fallocate+0x13f/0x270
ksys_fallocate+0x3c/0x80
__x64_sys_fallocate+0x1a/0x20
do_syscall_64+0x5b/0x180
entry_SYSCALL_64_after_hwframe+0x44/0xa9

There seems to be another potential COW issue/race with this approach
of different private and shared keys as noted in commit 8382d914eb
("mm, hugetlb: improve page-fault scalability").

Since every hugetlb mapping (even anon and private) is actually a file
mapping, just use the address_space index key for all mappings.  This
results in potentially more hash collisions.  However, this should not
be the common case.

Link: http://lkml.kernel.org/r/20190328234704.27083-3-mike.kravetz@oracle.com
Link: http://lkml.kernel.org/r/20190412165235.t4sscoujczfhuiyt@linux-r8p5
Fixes: b5cec28d36 ("hugetlbfs: truncate_hugepages() takes a range of pages")
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-14 09:47:48 -07:00
..
acpi Driver core/kobject patches for 5.2-rc1 2019-05-07 13:01:40 -07:00
asm-generic hugetlb: allow to free gigantic pages regardless of the configuration 2019-05-14 09:47:47 -07:00
clocksource
crypto crypto: shash - remove shash_desc::flags 2019-04-25 15:38:12 +08:00
drm drm pull request for 5.2 2019-05-08 21:35:19 -07:00
dt-bindings We have a couple new features and changes in the core clk framework this time 2019-05-09 14:50:09 -07:00
keys
kvm
linux hugetlb: use same fault hash key for shared and private mappings 2019-05-14 09:47:48 -07:00
math-emu
media drm pull request for 5.2 2019-05-08 21:35:19 -07:00
memory
misc ocxl: Provide global MMIO accessors for external drivers 2019-05-03 02:55:02 +10:00
net net/tcp: use deferred jump label for TCP acked data hook 2019-05-09 11:13:57 -07:00
pcmcia
ras
rdma RDMA: Add EFA related definitions 2019-05-06 13:47:50 -03:00
scsi scsi: libsas: Inject revalidate event for root port event 2019-04-15 18:55:00 -04:00
soc This pull request contains the following changes for MTD: 2019-05-12 17:57:52 -04:00
sound sound updates for 5.2-rc1 2019-05-09 08:26:55 -07:00
target scsi: target/iscsi: Handle too large immediate data buffers correctly 2019-04-12 20:20:06 -04:00
trace mm/vmscan: drop may_writepage and classzone_idx from direct reclaim begin template 2019-05-14 09:47:48 -07:00
uapi NFS client updates for Linux 5.2 2019-05-09 14:33:15 -07:00
video
xen