Zhong Jiang has reported a BUG_ON from huge_pte_alloc hitting when he
runs his database load with memory online and offline running in
parallel. The reason is that huge_pmd_share might detect a shared pmd
which is currently migrated and so it has migration pte which is
!pte_huge.
There doesn't seem to be any easy way to prevent from the race and in
fact seeing the migration swap entry is not harmful. Both callers of
huge_pte_alloc are prepared to handle them. copy_hugetlb_page_range
will copy the swap entry and make it COW if needed. hugetlb_fault will
back off and so the page fault is retries if the page is still under
migration and waits for its completion in hugetlb_fault.
That means that the BUG_ON is wrong and we should update it. Let's
simply check that all present ptes are pte_huge instead.
Link: http://lkml.kernel.org/r/20160721074340.GA26398@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: zhongjiang <zhongjiang@huawei.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In powerpc servers with large memory(32TB), we watched several soft
lockups for hugepage under stress tests.
The call traces are as follows:
1.
get_page_from_freelist+0x2d8/0xd50
__alloc_pages_nodemask+0x180/0xc20
alloc_fresh_huge_page+0xb0/0x190
set_max_huge_pages+0x164/0x3b0
2.
prep_new_huge_page+0x5c/0x100
alloc_fresh_huge_page+0xc8/0x190
set_max_huge_pages+0x164/0x3b0
This patch fixes such soft lockups. It is safe to call cond_resched()
there because it is out of spin_lock/unlock section.
Link: http://lkml.kernel.org/r/1469674442-14848-1-git-send-email-hejianet@gmail.com
Signed-off-by: Jia He <hejianet@gmail.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Apparently, the tools/testing version dates to a few flags ago, and
we've sprouted 4 new ones since. Keep in sync with the value in the
main tree...
Link: http://lkml.kernel.org/r/23400.1469702675@turing-police.cc.vt.edu
Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Every swap-in anonymous page starts from inactive lru list's head. It
should be activated unconditionally when VM decide to reclaim because
page table entry for the page always usually has marked accessed bit.
Thus, their window size for getting a new referece is 2 * NR_inactive +
NR_active while others is NR_inactive + NR_active.
It's not fair that it has more chance to be referenced compared to other
newly allocated page which starts from active lru list's head.
Johannes:
: The page can still have a valid copy on the swap device, so prefering to
: reclaim that page over a fresh one could make sense. But as you point
: out, having it start inactive instead of active actually ends up giving it
: *more* LRU time, and that seems to be without justification.
Rik:
: The reason newly read in swap cache pages start on the inactive list is
: that we do some amount of read-around, and do not know which pages will
: get used.
:
: However, immediately activating the ones that DO get used, like your patch
: does, is the right thing to do.
Link: http://lkml.kernel.org/r/1469762740-17860-1-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Rik van Riel <riel@redhat.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
I ran into this:
BUG: sleeping function called from invalid context at mm/page_alloc.c:3784
in_atomic(): 0, irqs_disabled(): 0, pid: 1434, name: trinity-c1
2 locks held by trinity-c1/1434:
#0: (&mm->mmap_sem){......}, at: [<ffffffff810ce31e>] __do_page_fault+0x1ce/0x8f0
#1: (rcu_read_lock){......}, at: [<ffffffff81378f86>] filemap_map_pages+0xd6/0xdd0
CPU: 0 PID: 1434 Comm: trinity-c1 Not tainted 4.7.0+ #58
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
Call Trace:
dump_stack+0x65/0x84
panic+0x185/0x2dd
___might_sleep+0x51c/0x600
__might_sleep+0x90/0x1a0
__alloc_pages_nodemask+0x5b1/0x2160
alloc_pages_current+0xcc/0x370
pte_alloc_one+0x12/0x90
__pte_alloc+0x1d/0x200
alloc_set_pte+0xe3e/0x14a0
filemap_map_pages+0x42b/0xdd0
handle_mm_fault+0x17d5/0x28b0
__do_page_fault+0x310/0x8f0
trace_do_page_fault+0x18d/0x310
do_async_page_fault+0x27/0xa0
async_page_fault+0x28/0x30
The important bits from the above is that filemap_map_pages() is calling
into the page allocator while holding rcu_read_lock (sleeping is not
allowed inside RCU read-side critical sections).
According to Kirill Shutemov, the prefaulting code in do_fault_around()
is supposed to take care of this, but missing error handling means that
the allocation failure can go unnoticed.
We don't need to return VM_FAULT_OOM (or any other error) here, since we
can just let the normal fault path try again.
Fixes: 7267ec008b ("mm: postpone page table allocation until we have page to map")
Link: http://lkml.kernel.org/r/1469708107-11868-1-git-send-email-vegard.nossum@oracle.com
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: "Hillf Danton" <hillf.zj@alibaba-inc.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We found a dlm-blocked situation caused by continuous breakdown of
recovery masters described below. To solve this problem, we should
purge recovery lock once detecting recovery master goes down.
N3 N2 N1(reco master)
go down
pick up recovery lock and
begin recoverying for N2
go down
pick up recovery
lock failed, then
purge it:
dlm_purge_lockres
->DROPPING_REF is set
send deref to N1 failed,
recovery lock is not purged
find N1 go down, begin
recoverying for N1, but
blocked in dlm_do_recovery
as DROPPING_REF is set:
dlm_do_recovery
->dlm_pick_recovery_master
->dlmlock
->dlm_get_lock_resource
->__dlm_wait_on_lockres_flags(tmpres,
DLM_LOCK_RES_DROPPING_REF);
Fixes: 8c03439681 ("ocfs2/dlm: clear DROPPING_REF flag when the master goes down")
Link: http://lkml.kernel.org/r/578453AF.8030404@huawei.com
Signed-off-by: Jun Piao <piaojun@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Jiufei Xue <xuejiufei@huawei.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We found a BUG situation that lockres is migrated during deref described
below. To solve the BUG, we could purge lockres directly when other
node says I did not have a ref. Additionally, we'd better purge lockres
if master goes down, as no one will response deref done.
Node 1 Node 2(old master) Node3(new master)
dlm_purge_lockres
send deref to N2
leave domain
migrate lockres to N3
finish migration
send do assert
master to N1
receive do assert msg
form N3, but can not
find lockres because
DROPPING_REF is set,
so the owner is still
N2.
receive deref from N1
and response -EINVAL
because lockres is migrated
BUG when receive -EINVAL
in dlm_drop_lockres_ref
Fixes: 842b90b624 ("ocfs2/dlm: return in progress if master can not clear the refmap bit right now")
Link: http://lkml.kernel.org/r/57845103.3070406@huawei.com
Signed-off-by: Jun Piao <piaojun@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Jiufei Xue <xuejiufei@huawei.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We found a BUG situation in which DLM_LOCK_RES_DROPPING_REF is cleared
unexpected that described below. To solve the bug, we disable the
BUG_ON and purge lockres in dlm_do_local_recovery_cleanup.
Node 1 Node 2(master)
dlm_purge_lockres
dlm_deref_lockres_handler
DLM_LOCK_RES_SETREF_INPROG is set
response DLM_DEREF_RESPONSE_INPROG
receive DLM_DEREF_RESPONSE_INPROG
stop puring in dlm_purge_lockres
and wait for DLM_DEREF_RESPONSE_DONE
dispatch dlm_deref_lockres_worker
response DLM_DEREF_RESPONSE_DONE
receive DLM_DEREF_RESPONSE_DONE and
prepare to purge lockres
Node 2 goes down
find Node2 down and do local
clean up for Node2:
dlm_do_local_recovery_cleanup
-> clear DLM_LOCK_RES_DROPPING_REF
when purging lockres, BUG_ON happens
because DLM_LOCK_RES_DROPPING_REF is clear:
dlm_deref_lockres_done_handler
->BUG_ON(!(res->state & DLM_LOCK_RES_DROPPING_REF));
[akpm@linux-foundation.org: fix duplicated write to `ret']
Fixes: 60d663cb52 ("ocfs2/dlm: add DEREF_DONE message")
Link: http://lkml.kernel.org/r/57845055.9080702@huawei.com
Signed-off-by: Jun Piao <piaojun@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Jiufei Xue <xuejiufei@huawei.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The testcase "mmaptruncate" in ocfs2 test suite always fails with ENOSPC
error on small volume (say less than 10G). This testcase repeatedly
performs "extend" and "truncate" on a file. Continuously, it truncates
the file to 1/2 of the size, and then extends to 100% of the size. The
main bitmap will quickly run out of space because the "truncate" code
prevent truncate log from being flushed by
ocfs2_schedule_truncate_log_flush(osb, 1), while truncate log may have
cached lots of clusters.
So retry to allocate after flushing truncate log when ENOSPC is
returned. And we cannot reuse the deleted blocks before the transaction
committed. Fortunately, we already have a function to do this -
ocfs2_try_to_free_truncate_log(). Just need to remove the "static"
modifier and put it into the right place.
The "unlock"/"lock" code isn't elegant, but there seems to be no better
option.
[zren@suse.com: locking fix]
Link: http://lkml.kernel.org/r/1468031546-4797-1-git-send-email-zren@suse.com
Link: http://lkml.kernel.org/r/1466586469-5541-1-git-send-email-zren@suse.com
Signed-off-by: Eric Ren <zren@suse.com>
Reviewed-by: Gang He <ghe@suse.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We encountered a bug from the customer, the user did a fsck.ocfs2 on the
file system and exited unusually, the lockspace (with LVB size = 32) was
left in the kernel space, next, the user mounted this file system, the
kernel module did not create a new lockspace (LVB size = 64) via calling
dlm_new_lockspace() function in mounting stage, just used the existing
lockspace, created by the user space tool, this would lead the user was
not able to mount this file system from the other nodes, with the error
message like:
dlm: 032F5......: config mismatch: 64,0 nodeid 177127961: 32,0
(mount.ocfs2,26981,46):ocfs2_dlm_init:2995 ERROR: status = -71
ocfs2_mount_volume:1881 ERROR: status = -71
ocfs2_fill_super:1236 ERROR: status = -71
The user found it very difficult to find the root cause, then, we
brought out this patch to relieve such problem.
First, we add one more flag in calling dlm_new_lockspace() function, to
make sure the lockspace is created by kernel module itself, and this
change will not affect the backward compatibility.
Second, the obvious error message is reported in the kernel log, let the
user be more easy to find the root cause.
This patch will be used to insure the dlm lockspace is created by kernel
module when mounting a ocfs2 file system. There are two ways to create
a lockspace, from user space and kernel space, but the same name
lockspaces probably have different lvblen lengths/flags.
To avoid this mix using, we add one more flag DLM_LSFL_NEWEXCL, it will
make sure the dlm lockspace is created by kernel module when mounting.
Secondly, if a user space program (ocfs2-tools) is running on a file
system, the user tries to mount this file system in the cluster, DLM
module will return a -EEXIST or -EPROTO errno, we should give the user a
obvious error message, then, the user can let that user space tool exit
before mounting the file system again.
Link: http://lkml.kernel.org/r/1463731940-13044-2-git-send-email-ghe@suse.com
Signed-off-by: Gang He <ghe@suse.com>
Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Bspec says:
"Restriction : SRD must not be enabled when the PSR Setup time from DPCD
00071h is greater than the time for vertical blank minus one line."
Let's check for that and disallow PSR if we exceed the limit.
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Add a small helper to parse the PSR setup time from the DPCD PSR
capabilities and return the value in microseconds.
v2: Don't waste so many bytes on the psr_setup_time_us[] table
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
NAND:
Updates from Boris:
"""
This pull request contains only one notable change:
* Addition of the MTK NAND controller driver
And a bunch of specific NAND driver improvements/fixes. Here are the
changes that are worth mentioning:
* A few fixes/improvements for the xway NAND controller driver
* A few fixes for the sunxi NAND controller driver
* Support for DMA in the sunxi NAND driver
* Support for the sunxi NAND controller IP embedded in A23/A33 SoCs
* Addition for bitflips detection in erased pages to the brcmnand driver
* Support for new brcmnand IPs
* Update of the OMAP-GPMC binding to support DMA channel description
"""
In addition, some small fixes around error handling, etc., as well as one
long-standing corner case issue (2.6.20, I think?) with writing 1 byte less
than a page.
NOR:
* Rework some error handling on reads and writes, so we can better handle (for
instance) SPI controllers which have limitations on their maximum transfer size
* Add new Cadence Quad SPI flash controller driver
* Add new Atmel QSPI flash controller driver
* Add new Hisilicon SPI flash controller driver
* Support a few new flash, and update supported features on others
* Fix the logic used for detecting a fully-unlocked flash
And other miscellaneous small fixes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXn/j0AAoJEFySrpd9RFgto58P/j1huB0d21zFen3teo8YKKr1
dLi65mFbqtpU1BLqD07uc2gsH17kvezFJCtx8H8Jp/yCjLF2kYIKL7wDTf6OJPtn
aYGS5dG5jhMIq+6CD2olKqy+IVLfL9GvCf44Z3fpVta5lOn09y7Jm0AkBjmJcH45
SdJi+iUkSKhqRY3O2udyauGyL4JWG7eHUonrG9g8ROrO0GWQkT4Ijm1ZwIBlFDkJ
e1N960OqPg5ISOzuTeM14Ok9IUyeb7wiXhqRfOJgzNk6iBcN1YC2ono3C7RH2sN7
wiyCqqUpDDCDBDiFdGOdpc9cjzNysrt02ypWRsZIpQVCm89nPLSutqQEWLuo0qzq
/eIzdwbk1AxX96CeQohOezqL+n6+RHP9AIvwzL9GeWjipD1LBvfM1l3CmuSKK5jb
bQ4CA/FVz1tO/25q8tuLJfpFzhFE2PC3pphVf8tREL/U6OR/97NgDMuQIuqiQpRc
4nJtu79yacAzEiztZh0bsx+t94QFE+kfs/6d8m+llLEyx2sI8HKZeDvNRwEB0OsD
wQ5bjyd54m7+i4H8njrnOTP+K2YrwNjGlbTo7qrRSpFMDr4mD0VQLap03Srvo5xV
OYB6uGZhGV2L3k6nG5wywM2z6Hw0QfHxjrpcLyG51xABmni2QF3lOdYSllRtRTXp
aYLfPUqgIgSUI2GTDHyW
=HQr0
-----END PGP SIGNATURE-----
Merge tag 'for-linus-20160801' of git://git.infradead.org/linux-mtd
Pull MTD updates from Brian Norris:
"NAND:
Quoting Boris:
'This pull request contains only one notable change:
- Addition of the MTK NAND controller driver
And a bunch of specific NAND driver improvements/fixes. Here are the
changes that are worth mentioning:
- A few fixes/improvements for the xway NAND controller driver
- A few fixes for the sunxi NAND controller driver
- Support for DMA in the sunxi NAND driver
- Support for the sunxi NAND controller IP embedded in A23/A33 SoCs
- Addition for bitflips detection in erased pages to the brcmnand driver
- Support for new brcmnand IPs
- Update of the OMAP-GPMC binding to support DMA channel description'
In addition, some small fixes around error handling, etc., as well
as one long-standing corner case issue (2.6.20, I think?) with
writing 1 byte less than a page.
NOR:
- rework some error handling on reads and writes, so we can better
handle (for instance) SPI controllers which have limitations on
their maximum transfer size
- add new Cadence Quad SPI flash controller driver
- add new Atmel QSPI flash controller driver
- add new Hisilicon SPI flash controller driver
- support a few new flash, and update supported features on others
- fix the logic used for detecting a fully-unlocked flash
And other miscellaneous small fixes"
* tag 'for-linus-20160801' of git://git.infradead.org/linux-mtd: (60 commits)
mtd: spi-nor: don't build Cadence QuadSPI on non-ARM
mtd: mtk-nor: remove duplicated include from mtk-quadspi.c
mtd: nand: fix bug writing 1 byte less than page size
mtd: update description of MTD_BCM47XXSFLASH symbol
mtd: spi-nor: Add driver for Cadence Quad SPI Flash Controller
mtd: spi-nor: Bindings for Cadence Quad SPI Flash Controller driver
mtd: nand: brcmnand: Change BUG_ON in brcmnand_send_cmd
mtd: pmcmsp-flash: Allocating too much in init_msp_flash()
mtd: maps: sa1100-flash: potential NULL dereference
mtd: atmel-quadspi: add driver for Atmel QSPI controller
mtd: nand: omap2: fix return value check in omap_nand_probe()
Documentation: atmel-quadspi: add binding file for Atmel QSPI driver
mtd: spi-nor: add hisilicon spi-nor flash controller driver
mtd: spi-nor: support dual, quad, and WP for Gigadevice
mtd: spi-nor: Added support for n25q00a.
memory: Update dependency of IFC for Layerscape
mtd: nand: jz4780: Update MODULE_AUTHOR email address
mtd: nand: sunxi: prevent a small memory leak
mtd: nand: sunxi: add reset line support
mtd: nand: sunxi: update DT bindings
...
Pull misc kbuild updates from Michal Marek:
- coccicheck script improvements by Luis Rodriguez and Deepa Dinamani
- new coccinelle patches by Yann Droneaud and Vaishali Thakkar
- debian packaging fixes by Wilfried Klaebe, Henning Schild and Marcin
Mielniczuk
* 'misc' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
Fix the Debian packaging script on systems with no codename
builddeb: fix file permissions before packaging
scripts/coccinelle: require coccinelle >= 1.0.4 on device_node_continue.cocci
coccicheck: refer to Documentation/coccinelle.txt and wiki
coccicheck: add support for requring a coccinelle version
scripts: add Linux .cocciconfig for coccinelle
coccicheck: replace --very-quiet with --quiet when debugging
coccicheck: add support for DEBUG_FILE
coccicheck: enable parmap support
coccicheck: make SPFLAGS more useful
coccicheck: move spatch binary check up
builddeb: really include objtool binary in headers package
coccinelle: catch krealloc() on devm_*() allocated memory
coccinelle: recognize more devm_* memory allocation functions
coccinelle: also catch kzfree() issues
coccicheck: Allow for overriding spatch flags
Coccinelle: noderef: Add new rules and correct the old rule
Pull kbuild updates from Michal Marek:
- GCC plugin support by Emese Revfy from grsecurity, with a fixup from
Kees Cook. The plugins are meant to be used for static analysis of
the kernel code. Two plugins are provided already.
- reduction of the gcc commandline by Arnd Bergmann.
- IS_ENABLED / IS_REACHABLE macro enhancements by Masahiro Yamada
- bin2c fix by Michael Tautschnig
- setlocalversion fix by Wolfram Sang
* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild:
gcc-plugins: disable under COMPILE_TEST
kbuild: Abort build on bad stack protector flag
scripts: Fix size mismatch of kexec_purgatory_size
kbuild: make samples depend on headers_install
Kbuild: don't add obj tree in additional includes
Kbuild: arch: look for generated headers in obtree
Kbuild: always prefix objtree in LINUXINCLUDE
Kbuild: avoid duplicate include path
Kbuild: don't add ../../ to include path
vmlinux.lds.h: replace config_enabled() with IS_ENABLED()
kconfig.h: allow to use IS_{ENABLE,REACHABLE} in macro expansion
kconfig.h: use already defined macros for IS_REACHABLE() define
export.h: use __is_defined() to check if __KSYM_* is defined
kconfig.h: use __is_defined() to check if MODULE is defined
kbuild: setlocalversion: print error to STDERR
Add sancov plugin
Add Cyclomatic complexity GCC plugin
GCC plugin infrastructure
Shared library support
VGIC implementation.
- s390: support for trapping software breakpoints, nested virtualization
(vSIE), the STHYI opcode, initial extensions for CPU model support.
- MIPS: support for MIPS64 hosts (32-bit guests only) and lots of cleanups,
preliminary to this and the upcoming support for hardware virtualization
extensions.
- x86: support for execute-only mappings in nested EPT; reduced vmexit
latency for TSC deadline timer (by about 30%) on Intel hosts; support for
more than 255 vCPUs.
- PPC: bugfixes.
The ugly bit is the conflicts. A couple of them are simple conflicts due
to 4.7 fixes, but most of them are with other trees. There was definitely
too much reliance on Acked-by here. Some conflicts are for KVM patches
where _I_ gave my Acked-by, but the worst are for this pull request's
patches that touch files outside arch/*/kvm. KVM submaintainers should
probably learn to synchronize better with arch maintainers, with the
latter providing topic branches whenever possible instead of Acked-by.
This is what we do with arch/x86. And I should learn to refuse pull
requests when linux-next sends scary signals, even if that means that
submaintainers have to rebase their branches.
Anyhow, here's the list:
- arch/x86/kvm/vmx.c: handle_pcommit and EXIT_REASON_PCOMMIT was removed
by the nvdimm tree. This tree adds handle_preemption_timer and
EXIT_REASON_PREEMPTION_TIMER at the same place. In general all mentions
of pcommit have to go.
There is also a conflict between a stable fix and this patch, where the
stable fix removed the vmx_create_pml_buffer function and its call.
- virt/kvm/kvm_main.c: kvm_cpu_notifier was removed by the hotplug tree.
This tree adds kvm_io_bus_get_dev at the same place.
- virt/kvm/arm/vgic.c: a few final bugfixes went into 4.7 before the
file was completely removed for 4.8.
- include/linux/irqchip/arm-gic-v3.h: this one is entirely our fault;
this is a change that should have gone in through the irqchip tree and
pulled by kvm-arm. I think I would have rejected this kvm-arm pull
request. The KVM version is the right one, except that it lacks
GITS_BASER_PAGES_SHIFT.
- arch/powerpc: what a mess. For the idle_book3s.S conflict, the KVM
tree is the right one; everything else is trivial. In this case I am
not quite sure what went wrong. The commit that is causing the mess
(fd7bacbca4, "KVM: PPC: Book3S HV: Fix TB corruption in guest exit
path on HMI interrupt", 2016-05-15) touches both arch/powerpc/kernel/
and arch/powerpc/kvm/. It's large, but at 396 insertions/5 deletions
I guessed that it wasn't really possible to split it and that the 5
deletions wouldn't conflict. That wasn't the case.
- arch/s390: also messy. First is hypfs_diag.c where the KVM tree
moved some code and the s390 tree patched it. You have to reapply the
relevant part of commits 6c22c98637, plus all of e030c1125e, to
arch/s390/kernel/diag.c. Or pick the linux-next conflict
resolution from http://marc.info/?l=kvm&m=146717549531603&w=2.
Second, there is a conflict in gmap.c between a stable fix and 4.8.
The KVM version here is the correct one.
I have pushed my resolution at refs/heads/merge-20160802 (commit
3d1f53419842) at git://git.kernel.org/pub/scm/virt/kvm/kvm.git.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJXoGm7AAoJEL/70l94x66DugQIAIj703ePAFepB/fCrKHkZZia
SGrsBdvAtNsOhr7FQ5qvvjLxiv/cv7CymeuJivX8H+4kuUHUllDzey+RPHYHD9X7
U6n1PdCH9F15a3IXc8tDjlDdOMNIKJixYuq1UyNZMU6NFwl00+TZf9JF8A2US65b
x/41W98ilL6nNBAsoDVmCLtPNWAqQ3lajaZELGfcqRQ9ZGKcAYOaLFXHv2YHf2XC
qIDMf+slBGSQ66UoATnYV2gAopNlWbZ7n0vO6tE2KyvhHZ1m399aBX1+k8la/0JI
69r+Tz7ZHUSFtmlmyByi5IAB87myy2WQHyAPwj+4vwJkDGPcl0TrupzbG7+T05Y=
=42ti
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
- ARM: GICv3 ITS emulation and various fixes. Removal of the
old VGIC implementation.
- s390: support for trapping software breakpoints, nested
virtualization (vSIE), the STHYI opcode, initial extensions
for CPU model support.
- MIPS: support for MIPS64 hosts (32-bit guests only) and lots
of cleanups, preliminary to this and the upcoming support for
hardware virtualization extensions.
- x86: support for execute-only mappings in nested EPT; reduced
vmexit latency for TSC deadline timer (by about 30%) on Intel
hosts; support for more than 255 vCPUs.
- PPC: bugfixes.
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (302 commits)
KVM: PPC: Introduce KVM_CAP_PPC_HTM
MIPS: Select HAVE_KVM for MIPS64_R{2,6}
MIPS: KVM: Reset CP0_PageMask during host TLB flush
MIPS: KVM: Fix ptr->int cast via KVM_GUEST_KSEGX()
MIPS: KVM: Sign extend MFC0/RDHWR results
MIPS: KVM: Fix 64-bit big endian dynamic translation
MIPS: KVM: Fail if ebase doesn't fit in CP0_EBase
MIPS: KVM: Use 64-bit CP0_EBase when appropriate
MIPS: KVM: Set CP0_Status.KX on MIPS64
MIPS: KVM: Make entry code MIPS64 friendly
MIPS: KVM: Use kmap instead of CKSEG0ADDR()
MIPS: KVM: Use virt_to_phys() to get commpage PFN
MIPS: Fix definition of KSEGX() for 64-bit
KVM: VMX: Add VMCS to CPU's loaded VMCSs before VMPTRLD
kvm: x86: nVMX: maintain internal copy of current VMCS
KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE
KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures
KVM: arm64: vgic-its: Simplify MAPI error handling
KVM: arm64: vgic-its: Make vgic_its_cmd_handle_mapi similar to other handlers
KVM: arm64: vgic-its: Turn device_id validation into generic ID validation
...
Describe use of jiffy-based timeout values involved in inode maintenance.
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
The userspace component attempts to do this, but this will prevent
us from even needing to go into userspace to satisfy certain getattr
requests.
Signed-off-by: Martin Brandenburg <martin@omnibond.com>
While running tools/testing/selftests test suite with KASAN, Dmitry
Vyukov hit the following use-after-free report:
==================================================================
BUG: KASAN: use-after-free in hist_unreg_all+0x1a1/0x1d0 at addr
ffff880031632cc0
Read of size 8 by task ftracetest/7413
==================================================================
BUG kmalloc-128 (Not tainted): kasan: bad access detected
------------------------------------------------------------------
This fixes the problem, along with the same problem in
hist_enable_unreg_all().
Link: http://lkml.kernel.org/r/c3d05b79e42555b6e36a3a99aae0e37315ee5304.1467247517.git.tom.zanussi@linux.intel.com
Cc: Dmitry Vyukov <dvyukov@google.com>
[Copied Steve's hist_enable_unreg_all() fix to hist_unreg_all()]
Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
With the latest gcc compilers, they give a warning if
__builtin_return_address() parameter is greater than 0. That is because if
it is used by a function called by a top level function (or in the case of
the kernel, by assembly), it can try to access stack frames outside the
stack and crash the system.
The tracing system uses __builtin_return_address() of up to 2! But it is
well aware of the dangers that it may have, and has even added precautions
to protect against it (see the thunk code in arch/x86/entry/thunk*.S)
Linus originally added KBUILD_CFLAGS that would suppress the warning for the
entire kernel, as simply adding KBUILD_CFLAGS to the tracing directory
wouldn't work. The tracing directory plays a bit with the CFLAGS and
requires a little more logic.
This adds that special logic to only suppress the warning for the tracing
directory. If it is used anywhere else outside of tracing, the warning will
still be triggered.
Link: http://lkml.kernel.org/r/20160728223043.51996267@grimm.local.home
Tested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
glibc recently did a sync up (94e73c95d9b5 "elf.h: Sync with the gabi
webpage") that added a #define for EM_METAG but did not add relocations
This triggers build errors:
scripts/recordmcount.c: In function 'do_file':
scripts/recordmcount.c:466:28: error: 'R_METAG_ADDR32' undeclared (first use in this function)
case EM_METAG: reltype = R_METAG_ADDR32;
^~~~~~~~~~~~~~
scripts/recordmcount.c:466:28: note: each undeclared identifier is reported only once for each function it appears in
scripts/recordmcount.c:468:20: error: 'R_METAG_NONE' undeclared (first use in this function)
rel_type_nop = R_METAG_NONE;
^~~~~~~~~~~~
Work around this change with some more #ifdefery for the relocations.
Fedora Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1354034
Link: http://lkml.kernel.org/r/1468005530-14757-1-git-send-email-labbott@redhat.com
Cc: stable@vger.kernel.org # v3.9+
Cc: James Hogan <james.hogan@imgtec.com>
Fixes: 00512bdd45 ("metag: ftrace support")
Reported-by: Ross Burton <ross.burton@intel.com>
Signed-off-by: Laura Abbott <labbott@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Pull more s390 updates from Martin Schwidefsky:
- some cleanup for the hugetlbfs pte/pmd conversion functions
- the code to check for the minimum CPU type is converted from
assembler to C and an informational message is added in case the CPU
is not new enough to run the kernel
- bug fixes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/ftrace/jprobes: Fix conflict between jprobes and function graph tracing
s390: Define AT_VECTOR_SIZE_ARCH for ARCH_DLINFO
s390/zcrypt: fix possible memory leak in ap_module_init()
s390/numa: only set possible nodes within node_possible_map
s390/als: fix compile with gcov enabled
s390/facilities: do not generate DWORDS define anymore
s390/als: print missing facilities on facility mismatch
s390/als: print machine type on facility mismatch
s390/als: convert architecture level set code to C
s390/sclp: move uninitialized data to data section
s390/zcrypt: Fix zcrypt suspend/resume behavior
s390/cio: fix premature wakeup during chp configure
s390/cio: convert cfg_lock mutex to spinlock
s390/mm: clean up pte/pmd encoding
The logic was inverted here, set the bit if frames are pending.
Fixes: ba8c3d6f16 ("mac80211: add an intermediate software queue implementation")
Signed-off-by: Felix Fietkau <nbd@nbd.name>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
The switch on chandef->width is missing a break on the
NL8211_CHAN_WIDTH_80P80 case; currently we get a WARN_ON when
center_freq2 is non-zero because of the missing break.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
100g support is not available in MSI mode. Failing the driver load in this scenario.
Please consider applying this to `net'.
Signed-off-by: Sudarsana Reddy Kalluru <sudarsana.kalluru@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Chen says:
====================
add missing of_node_put after calling of_parse_phandle
This patch set fixes missing of_node_put issue at ethernet driver.
of_node_put needs to be called when the device node which is got
from of_parse_phandle has finished using.
The compilation test has passed by using allmodconfig for drivers/net/ethernet.
Changes for v2:
- If the device node is local variable, it can be put in the same function.
- If the device node will be used the whole driver life cycle,
it should be put (call of_node_put) at driver's remove.
Patch [4, 5, 9, 14, 15/15]
- Fix the issue that the node still be used at error patch [6/15]
- Add acked for patch [11,12/15]
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
of_node_put needs to be called when the device node which is got
from of_parse_phandle has finished using.
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
of_node_put needs to be called when the device node which is got
from of_parse_phandle has finished using.
This commit fixes both local (in stmmac_axi_setup) and global
(plat->phy_node) device_node for this issue, and using the
correct device node when tries to put node at stmmac_probe_config_dt
for error path.
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
of_node_put needs to be called when the device node which is got
from of_parse_phandle has finished using.
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
of_node_put needs to be called when the device node which is got
from of_parse_phandle has finished using.
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
of_node_put needs to be called when the device node which is got
from of_parse_phandle has finished using.
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
of_node_put needs to be called when the device node which is got
from of_parse_phandle has finished using.
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
of_node_put needs to be called when the device node which is got
from of_parse_phandle has finished using.
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
of_node_put needs to be called when the device node which is got
from of_parse_phandle has finished using.
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
of_node_put needs to be called when the device node which is got
from of_parse_phandle has finished using.
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
of_node_put needs to be called when the device node which is got
from of_parse_phandle has finished using.
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
of_node_put needs to be called when the device node which is got
from of_parse_phandle has finished using.
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
of_node_put needs to be called when the device node which is got
from of_parse_phandle has finished using.
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
of_node_put needs to be called when the device node which is got
from of_parse_phandle has finished using.
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
of_node_put needs to be called when the device node which is got
from of_parse_phandle has finished using.
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
of_node_put needs to be called when the device node which is got
from of_parse_phandle (or of_node_get) has finished using.
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If tx timeout event occur, kernel will call rtl8139_tx_timeout_task() to reset
hardware. But in this function, driver does not stop tx and rx function before
reset hardware, that will cause system hang.
In this patch, add stop tx and rx function before reset hardware.
Signed-off-by: Chunhao Lin <hau@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix to return error code -EINVAL instead of 0 when EQ elements is
too larger, as done elsewhere in this function.
Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge drm updates from Dave Airlie:
"This is the main drm pull request for 4.8.
I'm down with a cold at the moment so hopefully this isn't in too bad
a state, I finished pulling stuff last week mostly (nouveau fixes just
went in today), so only this message should be influenced by illness.
Apologies to anyone who's major feature I missed :-)
Core:
Lockless GEM BO freeing
Non-blocking atomic work
Documentation changes (rst/sphinx)
Prep for new fencing changes
Simple display helpers
Master/auth changes
Register/unregister rework
Loads of trivial patches/fixes.
New stuff:
ARM Mali display driver (not the 3D chip)
sii902x RGB->HDMI bridge
Panel:
Support for new panels
Improved backlight support
Bridge:
Convert ADV7511 to bridge driver
ADV7533 support
TC358767 (DSI/DPI to eDP) encoder chip support
i915:
BXT support enabled by default
GVT-g infrastructure
GuC command submission and fixes
BXT workarounds
SKL/BKL workarounds
Demidlayering device registration
Thundering herd fixes
Missing pci ids
Atomic updates
amdgpu/radeon:
ATPX improvements for better dGPU power control on PX systems
New power features for CZ/BR/ST
Pipelined BO moves and evictions in TTM
GPU scheduler improvements
GPU reset improvements
Overclocking on dGPUs with amdgpu
Polaris powermanagement enabled
nouveau:
GK20A/GM20B volt and clock improvements.
Initial support for GP100/GP104 GPUs, GP104 will not yet support
acceleration due to NVIDIA having not released firmware for them as of yet.
exynos:
Exynos5433 SoC with IOMMU support.
vc4:
Shader validation for branching
imx-drm:
Atomic mode setting conversion
Reworked DMFC FIFO allocation
External bridge support
analogix-dp:
RK3399 eDP support
Lots of fixes.
rockchip:
Lots of small fixes.
msm:
DT bindings cleanups
Shrinker and madvise support
ASoC HDMI codec support
tegra:
Host1x driver cleanups
SOR reworking for DP support
Runtime PM support
omapdrm:
PLL enhancements
Header refactoring
Gamma table support
arcgpu:
Simulator support
virtio-gpu:
Atomic modesetting fixes.
rcar-du:
Misc fixes.
mediatek:
MT8173 HDMI support
sti:
ASOC HDMI codec support
Minor fixes
fsl-dcu:
Suspend/resume support
Bridge support
amdkfd:
Minor fixes.
etnaviv:
Enable GPU clock gating
hisilicon:
Vblank and other fixes"
* tag 'drm-for-v4.8' of git://people.freedesktop.org/~airlied/linux: (1575 commits)
drm/nouveau/gr/nv3x: fix instobj write offsets in gr setup
drm/nouveau/acpi: fix lockup with PCIe runtime PM
drm/nouveau/acpi: check for function 0x1B before using it
drm/nouveau/acpi: return supported DSM functions
drm/nouveau/acpi: ensure matching ACPI handle and supported functions
drm/nouveau/fbcon: fix font width not divisible by 8
drm/amd/powerplay: remove enable_clock_power_gatings_tasks from initialize and resume events
drm/amd/powerplay: move clockgating to after ungating power in pp for uvd/vce
drm/amdgpu: add query device id and revision id into system info entry at CGS
drm/amdgpu: add new definition in bif header
drm/amd/powerplay: rename smum header guards
drm/amdgpu: enable UVD context buffer for older HW
drm/amdgpu: fix default UVD context size
drm/amdgpu: fix incorrect type of info_id
drm/amdgpu: make amdgpu_cgs_call_acpi_method as static
drm/amdgpu: comment out unused defaults_staturn_pro static const structure to fix the build
drm/amdgpu: enable UVD VM only on polaris
drm/amdgpu: increase timeout of IB test
drm/amdgpu: add destroy session when generate VCE destroy msg.
drm/amd: fix deadlock of job_list_lock V2
...