Currently we are implementing vmalloc_to_pfn() as a wrapper around
vmalloc_to_page(), which is implemented as follow:
1. walks the page talbes to generates the corresponding pfn,
2. then converts the pfn to struct page,
3. returns it.
And vmalloc_to_pfn() re-wraps vmalloc_to_page() to get the pfn.
This seems too circuitous, so this patch reverses the way: implement
vmalloc_to_page() as a wrapper around vmalloc_to_pfn(). This makes
vmalloc_to_pfn() and vmalloc_to_page() slightly more efficient.
No functional change.
Signed-off-by: Jianyu Zhan <nasa4836@gmail.com>
Cc: Vladimir Murzin <murzin.v@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mempolicies only exist for CONFIG_NUMA configurations. Therefore, a
certain class of functions are unneeded in configurations where
CONFIG_NUMA is disabled such as functions that duplicate existing
mempolicies, lookup existing policies, set certain mempolicy traits, or
test mempolicies for certain attributes.
Remove the unneeded functions so that any future callers get a compile-
time error and protect their code with CONFIG_NUMA as required.
Signed-off-by: David Rientjes <rientjes@google.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When copy_hugetlb_page_range() is called to copy a range of hugetlb
mappings, the secondary MMUs are not notified if there is a protection
downgrade, which breaks COW semantics in KVM.
This patch adds the necessary MMU notifier calls.
Signed-off-by: Andreas Sandberg <andreas@sandberg.pp.se>
Acked-by: Steve Capper <steve.capper@linaro.org>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If DEBUG_SPINLOCK and DEBUG_LOCK_ALLOC are enabled spinlock_t on x86_64
is 72 bytes. For page->ptl they will be allocated from kmalloc-96 slab,
so we loose 24 on each. An average system can easily allocate few tens
thousands of page->ptl and overhead is significant.
Let's create a separate slab for page->ptl allocation to solve this.
To make sure that it really works this time, some numbers from my test
machine (just booted, no load):
Before:
# grep '^\(kmalloc-96\|page->ptl\)' /proc/slabinfo
kmalloc-96 31987 32190 128 30 1 : tunables 120 60 8 : slabdata 1073 1073 92
After:
# grep '^\(kmalloc-96\|page->ptl\)' /proc/slabinfo
page->ptl 27516 28143 72 53 1 : tunables 120 60 8 : slabdata 531 531 9
kmalloc-96 3853 5280 128 30 1 : tunables 120 60 8 : slabdata 176 176 0
Note that the patch is useful not only for debug case, but also for
PREEMPT_RT, where spinlock_t is always bloated.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Yasuaki Ishimatsu reported memory hot-add spent more than 5 _hours_ on
9TB memory machine since onlining memory sections is too slow. And we
found out setup_zone_migrate_reserve spent >90% of the time.
The problem is, setup_zone_migrate_reserve scans all pageblocks
unconditionally, but it is only necessary if the number of reserved
block was reduced (i.e. memory hot remove).
Moreover, maximum MIGRATE_RESERVE per zone is currently 2. It means
that the number of reserved pageblocks is almost always unchanged.
This patch adds zone->nr_migrate_reserve_block to maintain the number of
MIGRATE_RESERVE pageblocks and it reduces the overhead of
setup_zone_migrate_reserve dramatically. The following table shows time
of onlining a memory section.
Amount of memory | 128GB | 192GB | 256GB|
---------------------------------------------
linux-3.12 | 23.9 | 31.4 | 44.5 |
This patch | 8.3 | 8.3 | 8.6 |
Mel's proposal patch | 10.9 | 19.2 | 31.3 |
---------------------------------------------
(millisecond)
128GB : 4 nodes and each node has 32GB of memory
192GB : 6 nodes and each node has 32GB of memory
256GB : 8 nodes and each node has 32GB of memory
(*1) Mel proposed his idea by the following threads.
https://lkml.org/lkml/2013/10/30/272
[akpm@linux-foundation.org: tweak comment]
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Reported-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Tested-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Many load balancing and workload placing programs check /proc/meminfo to
estimate how much free memory is available. They generally do this by
adding up "free" and "cached", which was fine ten years ago, but is
pretty much guaranteed to be wrong today.
It is wrong because Cached includes memory that is not freeable as page
cache, for example shared memory segments, tmpfs, and ramfs, and it does
not include reclaimable slab memory, which can take up a large fraction
of system memory on mostly idle systems with lots of files.
Currently, the amount of memory that is available for a new workload,
without pushing the system into swap, can be estimated from MemFree,
Active(file), Inactive(file), and SReclaimable, as well as the "low"
watermarks from /proc/zoneinfo.
However, this may change in the future, and user space really should not
be expected to know kernel internals to come up with an estimate for the
amount of free memory.
It is more convenient to provide such an estimate in /proc/meminfo. If
things change in the future, we only have to change it in one place.
Signed-off-by: Rik van Riel <riel@redhat.com>
Reported-by: Erik Mouw <erik.mouw_2@nxp.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
get_huge_page_tail()->compound_head() looks confusing. Every caller
must check PageTail(page), otherwise atomic_inc(&page->_mapcount) is
simply wrong if this page is compound-trans-head.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Dave Jones <davej@redhat.com>
Cc: Darren Hart <dvhart@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Acked-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Cleanup. Change __get_page_tail_foll() to use get_huge_page_tail()
to avoid the code duplication.
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Dave Jones <davej@redhat.com>
Cc: Darren Hart <dvhart@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Mel Gorman <mgorman@suse.de>
Acked-by: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
No actual need of it. So keep it internal.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Pravin Shelar <pshelar@nicira.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tweak it so save a tab stop, make code layout slightly less nutty.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Pravin Shelar <pshelar@nicira.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This skips the _mapcount mangling for slab and hugetlbfs pages.
The main trouble in doing this is to guarantee that PageSlab and
PageHeadHuge remains constant for all get_page/put_page run on the tail
of slab or hugetlbfs compound pages. Otherwise if they're set during
get_page but not set during put_page, the _mapcount of the tail page
would underflow.
PageHeadHuge will remain true until the compound page is released and
enters the buddy allocator so it won't risk to change even if the tail
page is the last reference left on the page.
PG_slab instead is cleared before the slab frees the head page with
put_page, so if the tail pin is released after the slab freed the page,
we would have a problem. But in the slab case the tail pin cannot be
the last reference left on the page. This is because the slab code is
free to reuse the compound page after a kfree/kmem_cache_free without
having to check if there's any tail pin left. In turn all tail pins
must be always released while the head is still pinned by the slab code
and so we know PG_slab will be still set too.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Pravin Shelar <pshelar@nicira.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently we don't clobber page_tail->first_page during split_huge_page,
so compound_trans_head can be set to compound_head without adverse
effects, and this mostly optimizes away a smp_rmb.
It looks worthwhile to keep around the implementation that doesn't relay
on page_tail->first_page not to be clobbered, because it would be
necessary if we'll decide to enforce page->private to zero at all times
whenever PG_private is not set, also for anonymous pages. For anonymous
pages enforcing such an invariant doesn't matter as anonymous pages
don't use page->private so we can get away with this microoptimization.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Pravin Shelar <pshelar@nicira.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We don't actually need a reference on the head page in the slab and
hugetlbfs paths, as long as we add a smp_rmb() which should be faster
than get_page_unless_zero.
[akpm@linux-foundation.org: fix typo in comment]
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Pravin Shelar <pshelar@nicira.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
get_page_foll() is more optimal and is always safe to use under the PT
lock. More so for hugetlbfs as there's no risk of race conditions with
split_huge_page regardless of the PT lock.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Tested-by: Khalid Aziz <khalid.aziz@oracle.com>
Cc: Pravin Shelar <pshelar@nicira.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Dave Jiang reported that he was seeing oopses when running NUMA systems
and default_hugepagesz=1G. I traced the issue down to
migrate_page_copy() trying to use the same code for hugetlb pages and
transparent hugepages. It should not have been trying to pass thp pages
in there.
So, add some VM_BUG_ON()s for the next hapless VM developer that tries
the same thing.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Tested-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
{,set}page_address() are macros if WANT_PAGE_VIRTUAL. If
!WANT_PAGE_VIRTUAL, they're plain C functions.
If someone calls them with a void *, this pointer is auto-converted to
struct page * if !WANT_PAGE_VIRTUAL, but causes a build failure on
architectures using WANT_PAGE_VIRTUAL (arc, m68k and sparc64):
drivers/md/bcache/bset.c: In function `__btree_sort':
drivers/md/bcache/bset.c:1190: warning: dereferencing `void *' pointer
drivers/md/bcache/bset.c:1190: error: request for member `virtual' in something not a structure or union
Convert them to static inline functions to fix this. There are already
plenty of users of struct page members inside <linux/mm.h>, so there's
no reason to keep them as macros.
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Tested-by: David Rientjes <rientjes@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The ramfs is always built in. It will never be modular, so using
module_init as an alias for __initcall is rather misleading.
Fix this up now, so that we can relocate module_init from init.h into
module.h in the future. If we don't do this, we'd have to add module.h
to obviously non-modular code, and that would be a worse thing.
Note that direct use of __initcall is discouraged, vs. one of the
priority categorized subgroups. As __initcall gets mapped onto
device_initcall, our use of fs_initcall (which makes sense for fs code)
will thus change this registration from level 6-device to level 5-fs
(i.e. slightly earlier). However no observable impact of that small
difference has been observed during testing, or is expected.
Also note that this change uncovers a missing semicolon bug in the
registration of the initcall.
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
On fail path alloc_super() calls destroy_super(), which issues a warning
if the sb's s_mounts list is not empty, in particular if it has not been
initialized. That said s_mounts must be initialized in alloc_super()
before any possible failure, but currently it is initialized close to
the end of the function leading to a useless warning dumped to log if
either percpu_counter_init() or list_lru_init() fails. Let's fix this.
Signed-off-by: Vladimir Davydov <vdavydov@parallels.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The compat_do_readv_writev() function was doing a verify_area on the
incoming iov, but the nr_segs value is not checked. If someone passes
in a -1 for nr_segs, for instance, the function should return an EINVAL.
However, it returns a EFAULT because the verify_area fails because it is
checking an array of size MAX_UINT. The check is bogus, anyway, because
the next check, compat_rw_copy_check_uvector(), will do all the
necessary checking, anyway. The non-compat do_readv_writev() function
doesn't do this check, so I think it's safe to just remove the code.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We cap "nmsgs" at I2C_RDRW_IOCTL_MAX_MSGS (42) but the current code
allows negative values. It's harmless but it makes my static checker
upset so I've made nsmgs unsigned.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Uninline vast tracts of nested inline functions in
include/linux/posix_acl.h.
This reduces the text+data+bss size of x86_64 allyesconfig vmlinux by
8026 bytes.
The patch also regularises the positioning of the EXPORT_SYMBOLs in
posix_acl.c.
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: J. Bruce Fields <bfields@fieldses.org>
Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
Tested-by: Benny Halevy <bhalevy@primarydata.com>
Cc: Benny Halevy <bhalevy@panasas.com>
Cc: Andreas Gruenbacher <agruen@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
arch/sh/kernel/kgdb.c: In function 'sleeping_thread_to_gdb_regs':
arch/sh/kernel/kgdb.c:225:32: error: implicit declaration of function 'task_stack_page' [-Werror=implicit-function-declaration]
arch/sh/kernel/kgdb.c:242:23: error: dereferencing pointer to incomplete type
arch/sh/kernel/kgdb.c:243:22: error: dereferencing pointer to incomplete type
arch/sh/kernel/kgdb.c: In function 'singlestep_trap_handler':
arch/sh/kernel/kgdb.c:310:27: error: 'SIGTRAP' undeclared (first use in this function)
arch/sh/kernel/kgdb.c:310:27: note: each undeclared identifier is reported only once for each function it appears in
This was introduced by commit 16559ae48c ("kgdb: remove #include
<linux/serial_8250.h> from kgdb.h").
[geert@linux-m68k.org: reworded and reformatted]
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Geert Uytterhoeven <geert+renesas@linux-m68k.org>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 nodes cluster, say Node A and Node B, mount the same ocfs2 volume, and
create a file 1.
Node A Node B
open 1, get open lock
rm 1, and then add 1 to orphan_dir
storage link down,
o2hb_write_timeout
->o2quo_disk_timeout
->emergency_restart
at the moment, Node B dismount and do
ocfs2rec simultaneously
1) ocfs2_dismount_volume
->ocfs2_recovery_exit
->wait_event(osb->recovery_event)
->flush_workqueue(ocfs2_wq)
2) ocfs2rec
->queue_work(&journal->j_recovery_work)
->ocfs2_recover_orphans
->ocfs2_commit_truncate
->queue_delayed_work(&osb->osb_truncate_log_wq)
In ocfs2_recovery_exit, it flushes workqueue and then releases system
inodes. When doing ocfs2rec, it will call ocfs2_flush_truncate_log
which will try to get sys_root_inode, and NULL pointer dereference
occurs.
Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
Signed-off-by: joyce <xuejiufei@huawei.com>
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
An unreserve space ioctl OCFS2_IOC_UNRESVSP/64 should reject a negative
length.
Orabug:14789508
Signed-off-by: Tariq Saseed <tariq.x.saeed@oracle.com>
Signed-off-by: Srinivas Eeda <srinivas.eeda@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes the following sparse warning:
fs/ocfs2/stack_user.c:930:32: warning:
symbol 'ocfs2_ls_ops' was not declared. Should it be static?
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Adjust minlen with discard_granularity for FITRIM ioctl(2) if the given
minimum size in bytes is less than it because, discard granularity is
used to tell us that the minimum size of extent that can be discarded by
the storage device.
This is inspired by ext4 commit 5c2ed62fd4 ("ext4: Adjust minlen with
discard_granularity in the FITRIM ioctl") from Lukas Czerner.
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For FITRIM ioctl(2), we should not keep silence if the given range
length ls less than a block size as there is no data blocks would be
discareded. Hence it should return EINVAL instead. This issue can be
verified via xfstests/generic/288 which is used for FITRIM argument
handling tests.
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For FITRIM ioctl(2), we should return EOPNOTSUPP to inform the user that
the storage device does not support discard if it is, otherwise return
success would confuse the user even though there is no free blocks were
trimmed at all.
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
ocfs2_alloc_dinode_update_counts() and ocfs2_block_group_set_bits() are
already provided in suballoc.c. So, the same functions in
move_extents.c are not needed any more.
Declare the functions in suballoc.h and remove redundant functions in
move_extents.c.
Signed-off-by: Younger Liu <liuyiyang@hisense.com>
Cc: Younger Liu <younger.liucn@gmail.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Attempt to use the new DLM operations. If it is not supported, use the
traditional ocfs2_controld.
To exchange ocfs2 versioning, we use the LVB of the version dlm lock.
It first attempts to take the lock in EX mode (non-blocking). If
successful (which means it is the first mount), it writes the version
number and downconverts to PR lock. If it is unsuccessful, it reads the
version from the lock.
If this becomes the standard (with o2cb as well), it could simplify
userspace tools to check if the filesystem is mounted on other nodes.
Dan: Since ocfs2_protocol_version are two u8 values, the additional
checks with LONG* don't make sense.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Use the native DLM locks for version control negotiation. Most of the
framework is taken from gfs2/lock_dlm.c
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is done to differentiate between using and not using controld and
use the connection information accordingly.
We need to be backward compatible. So, we use a new enum
ocfs2_connection_type to identify when controld is used and when it is
not.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We perform this because the DLM recovery callbacks will require the
ocfs2_live_connection structure to record the node information when
dlm_new_lockspace() is updated (in the last patch of the series).
Before calling dlm_new_lockspace(), we need the structure ready for the
.recover_done() callback, which would set oc_this_node. This is the
reason we allocate ocfs2_live_connection beforehand in user_connect().
[AKPM] rc initialization is not required because it assigned in case of
errors. It will be cleared by compiler anyways.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Reveiwed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
These are the callbacks called by the fs/dlm code in case the membership
changes. If there is a failure while/during calling any of these, the
DLM creates a new membership and relays to the rest of the nodes.
- recover_prep() is called when DLM understands a node is down.
- recover_slot() is called once all nodes have acknowledged
recover_prep and recovery can begin.
- recover_done() is called once the recovery is complete. It returns
the new membership.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This is an effort of removing ocfs2_controld.pcmk and getting ocfs2 DLM
handling up to the times with respect to DLM (>=4.0.1) and corosync
(2.3.x). AFAIK, cman also is being phased out for a unified corosync
cluster stack.
fs/dlm performs all the functions with respect to fencing and node
management and provides the API's to do so for ocfs2. For all future
references, DLM stands for fs/dlm code.
The advantages are:
+ No need to run an additional userspace daemon (ocfs2_controld)
+ No controld device handling and controld protocol
+ Shifting responsibilities of node management to DLM layer
For backward compatibility, we are keeping the controld handling code.
Once enough time has passed we can remove a significant portion of the
code. This was tested by using the kernel with changes on older
unmodified tools. The kernel used ocfs2_controld as expected, and
displayed the appropriate warning message.
This feature requires modification in the userspace ocfs2-tools. The
changes can be found at: https://github.com/goldwynr/ocfs2-tools branch:
nocontrold Currently, not many checks are present in the userspace code,
but that would change soon.
This patch (of 6):
Add clustername to cluster connection.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The versioning information is confusing for end-users. The numbers are
stuck at 1.5.0 when the tools version have moved to 1.8.2. Remove the
versioning system in the OCFS2 modules and let the kernel version be the
guide to debug issues.
Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
Acked-by: Sunil Mushran <sunil.mushran@gmail.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Acked-by: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Commit 5fbbf8a1a9 ("Score: The commit is for compiling successfully.")
re-introduced "select HAVE_GENERIC_HARDIRQS" in v3.12-rc4, which had
just been removed in v3.12-rc1 by 0244ad004a ("Remove GENERIC_HARDIRQ
config option").
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Chen Liqin <liqin.linux@gmail.com>
Cc: Lennox Wu <lennox.wu@gmail.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
dma_pte_free_level() has an off-by-one error when checking whether a pte
is completely covered by a range. Take for example the case of
attempting to free pfn 0x0 - 0x1ff, ie. 512 entries covering the first
2M superpage.
The level_size() is 0x200 and we test:
static void dma_pte_free_level(...
...
if (!(0 > 0 || 0x1ff < 0 + 0x200)) {
...
}
Clearly the 2nd test is true, which means we fail to take the branch to
clear and free the pagetable entry. As a result, we're leaking
pagetables and failing to install new pages over the range.
This was found with a PCI device assigned to a QEMU guest using vfio-pci
without a VGA device present. The first 1M of guest address space is
mapped with various combinations of 4K pages, but eventually the range
is entirely freed and replaced with a 2M contiguous mapping.
intel-iommu errors out with something like:
ERROR: DMA PTE for vPFN 0x0 already set (to 5c2b8003 not 849c00083)
In this case 5c2b8003 is the pointer to the previous leaf page that was
neither freed nor cleared and 849c00083 is the superpage entry that
we're trying to replace it with.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
We usually rely on the fact that struct members not specified in the
initializer are set to NULL. So do that with fsnotify function pointers
as well.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After removing event structure creation from the generic layer there is
no reason for separate .should_send_event and .handle_event callbacks.
So just remove the first one.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Currently fsnotify framework creates one event structure for each
notification event and links this event into all interested notification
groups. This is done so that we save memory when several notification
groups are interested in the event. However the need for event
structure shared between inotify & fanotify bloats the event structure
so the result is often higher memory consumption.
Another problem is that fsnotify framework keeps path references with
outstanding events so that fanotify can return open file descriptors
with its events. This has the undesirable effect that filesystem cannot
be unmounted while there are outstanding events - a regression for
inotify compared to a situation before it was converted to fsnotify
framework. For fanotify this problem is hard to avoid and users of
fanotify should kind of expect this behavior when they ask for file
descriptors from notified files.
This patch changes fsnotify and its users to create separate event
structure for each group. This allows for much simpler code (~400 lines
removed by this patch) and also smaller event structures. For example
on 64-bit system original struct fsnotify_event consumes 120 bytes, plus
additional space for file name, additional 24 bytes for second and each
subsequent group linking the event, and additional 32 bytes for each
inotify group for private data. After the conversion inotify event
consumes 48 bytes plus space for file name which is considerably less
memory unless file names are long and there are several groups
interested in the events (both of which are uncommon). Fanotify event
fits in 56 bytes after the conversion (fanotify doesn't care about file
names so its events don't have to have it allocated). A win unless
there are four or more fanotify groups interested in the event.
The conversion also solves the problem with unmount when only inotify is
used as we don't have to grab path references for inotify events.
[hughd@google.com: fanotify: fix corruption preventing startup]
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rounding of name length when passing it to userspace was done in several
places. Provide a function to do it and use it in all places.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Eric Paris <eparis@parisplace.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Record actively mapped pages and provide an api for asserting a given
page is dma inactive before execution proceeds. Placing
debug_dma_assert_idle() in cow_user_page() flagged the violation of the
dma-api in the NET_DMA implementation (see commit 7787380336 "net_dma:
mark broken").
The implementation includes the capability to count, in a limited way,
repeat mappings of the same page that occur without an intervening
unmap. This 'overlap' counter is limited to the few bits of tag space
in a radix tree. This mechanism is added to mitigate false negative
cases where, for example, a page is dma mapped twice and
debug_dma_assert_idle() is called after the page is un-mapped once.
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: James Bottomley <JBottomley@Parallels.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
- Power supply notifier
- Several drivers gained DT support
- Added Maxim 14577 driver
- Change of maintainer
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJS3bUsAAoJEBTbcu2+gGW4uC8QAJFPpPqnEjz9NnFjbngyTswc
+Wq5kusHpdgIXCqx9+H26pm2NTyIm4uAcVn2XjYIvdKZbDvygNzMr8gPYpwWapJz
1K6ED1sZROhbvSKx0ADuGHz05Kwl78Kx6Qa+YtesGaKMaTxbJySWqh81zmYvN5Sq
SoSqWC866bJwL4BVFIi7qZzHuD4czMP+iI8nrDKm1OFpj94y6DzC+BNxruYRCieI
k+EEjTR6sKfQ0XFNxhyWSxxZtcxexVNvck7wsMleUUGysWHFjIu8HF01o9v58mt9
o/Tr8ZAHx8h0vPVtFF4ugnc+LZHwodqlcoFj1Q+5w3AsGYZUVMOIv2wY7gQZnRTj
UcDXC06fG1ODMuBc1IUZy4j1ABAk+xoNW7zylMdtpgNDTpYdF5Mfy/yuD1eu5wzT
SZyRpuCfoXEP1MDt+lcgBwakb4nh4HFgnzxbKWNwzKSsbJ/ZOHja5k4xe+aQ8yQX
VmifBL8WUSYj97UQVy44/9DxSI7hE4GN7oXBKN2G/an5inqceePNDl89dymywBi1
NpWGX7a2NcQpdP+EpwLPuMU/J8SC5rzPWgMttpablrFSY0NQYJhmL3AdEcROP5gf
cclLaT+UKgSTxOdAN4b8PIlMmffhJjvfBdxMsBaxe8PtLkRaXyXRcBkH/eDEsk7S
Jj+gZOgLoThqJSK/IbB3
=QMHd
-----END PGP SIGNATURE-----
Merge tag 'for-v3.14' of git://git.infradead.org/battery-2.6
Pull battery updates from Dmitry Eremin-Solenikov:
"I'm picking up power supply maintainership from Anton Vorontov. Could
you please pull battery-2.6 git tree changes prepared for the v3.14
release.
Highlights:
- Power supply notifier
- Several drivers gained DT support
- Added Maxim 14577 driver
- Change of maintainer"
* tag 'for-v3.14' of git://git.infradead.org/battery-2.6:
MAINTAINERS: Pick up power supply maintainership
max17042_battery: Add IRQF_ONESHOT flag to use default irq handler
gpio-charger: Support wakeup events
power_supply: Add charger support for Maxim 14577
dt: Binding documentation for isp1704 charger
isp1704_charger: Add DT support
charger-manager: of_cm_parse_desc() should be static
bq2415x_charger: Add DT support
power_supply: Add power_supply_get_by_phandle
bq2415x_charger: Use power_supply notifier for automode
power: reset: Add as3722 power-off driver
mfd: AS3722: Add dt node properties for system power controller
charger-manager: Support deivce tree in charger manager driver
charger-manager: Modify the way of checking battery's temperature
power_supply: Add power_supply notifier
New drivers
- Samsung Maxim 14577; Micro USB, Regulator, IRQ Controller and Battery Charger
- TI/National Semiconductor LP3943 I2C GPIO Expander and PWM Generator
Existing driver adaptions
- Expansion of Wolfson Arizona DSP and High-Pass filter controls
- TI TWL6040 default Regmap support and Regcache addition/bypass
- Some nice Smatch catch fixes
- Conversion of TI OMAP-USB and TI TWL6030 to endian neutralness
- ChromeOS EC timing (delay) adaptions and added dependency on OF
- Many constifications of 'struct {mfd_cell,regmap_irq,et. al}'
- Watchdog support added for NVIDIA AS3722
- Convert functions to static in TI AM335x
- Realigned previously defeated functionality in TI AM335x
- IIO ADC-TSC concurrency dead-lock/timeout resolution
- Addition of Power Management and Clock support for Samsung core
- DEFINE_PCI_DEVICE_TABLE macro removal from MFD Subsystem
- Greater use of irqdomain functionality in ST-E AB8500
- Removal of 'include/linux/mfd/abx500/ab8500-gpio.h'
- Wolfson WM831x PMIC Power Management changes s/poweroff/shutdown/
- Device Tree documentation added for TI/Nat Semi LP3943
- Version detection and voltage tables for TI TPS6586x PMIC devices
- Simplification of Freescale MC13XXX (de-)initialisation routines
- Clean-up and simplification of the Realtek parent driver
- Added support for RTL8402 Realtek PCI-Express card reader
- Resource leak fix for Maxim 77686
- Possible suspend BUG() fix in OMAP USB TLL
- Support for new Wolfson WM5110 Revision (D)
- Testing of automatic assignment of of_node in mfd_add_device()
- Reversion of the above when it started to cause issues
- Remove legacy Platform Data from;
TI TWL Core, Qualcomm SSBI and ST-E ABx500 Pinctrl
- Clean-ups; tabbing issues, function name changes, 'drvdata = NULL' removal,
unused uninitialised warning mitigation, error message clarity,
removal of redundant/duplicate checks, licensing (GPL -> GPL2),
coding consistency, duplicate function declaration, ret checks,
commit corrections, redundant of_match_ptr() helper removal,
spelling, #if-deffery removal and header guards name changes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQIbBAABAgAGBQJS3pLGAAoJEFGvii+H/HdhmkkP93Hrd9FBjVpmUQcOrghFDd//
vte2LVDovXDcwm7i+BdZNG3+2aWtliTQXIw8PaAziUTwMlDNtT2B6GBFnIff4aXB
Em/Oh6Je7r1gom1gMPCuefRrInTk0xEXy9Oazp4Hn4in71T+8PHNlEHdxEojakEm
H5FnjAfgISEsA5twSyO9efVLNqPd3UQqg3O571oKwfuSED70YSCW2Yyaoiz4pnE5
0WwZ9cel+sP7CIuyuR4TumUSDeBIAnYnZWqjqXZ1ueMWcm2RNVqeFrt/w0uoZjOA
yBg8ZMfkBcePd6qnifqVqagRW/jW1bxmUeIHkp0bWeMqWN6Yyypitz8ZW+Qi7Swa
OcmgM9V7OW1WG9FF7HoLbYHIPzmBb6duGtcCfAir4m8HJjyPfTuJpOshBW1F3+VG
yEf5a1fj2NO34kvIbLec2f7MveIMmZxzWaoOx+ET9/WPknilifgyp7eDH24pQwI4
5Lo5Z5uAfBCT3roOzHxCLl2nVXQoC66iTwdnneiEOn4rB/ApjfGVvGGd0VT6TD+g
z3RqxpTdkd0AtjfeF778uTDBEKu7HZkqmlBP8HKWCBEAzqcKg7BpjYw0ajgmVwKr
QiuBuWcEZ/2vVt8Qot7y5Vx89Q4AQwOqc24SldtQLu46iPAuKt+GizzHRw3IxBiQ
VU9Aq/VoaTHBLS91tDE=
=PuTE
-----END PGP SIGNATURE-----
Merge tag 'mfd-3.14-1' of git://git.linaro.org/people/ljones/mfd
Pull MFD changes from Lee Jones:
"New drivers
- Samsung Maxim 14577; Micro USB, Regulator, IRQ Controller and
Battery Charger
- TI/National Semiconductor LP3943 I2C GPIO Expander and PWM
Generator
Existing driver adaptions
- Expansion of Wolfson Arizona DSP and High-Pass filter controls
- TI TWL6040 default Regmap support and Regcache addition/bypass
- Some nice Smatch catch fixes
- Conversion of TI OMAP-USB and TI TWL6030 to endian neutralness
- ChromeOS EC timing (delay) adaptions and added dependency on OF
- Many constifications of 'struct {mfd_cell,regmap_irq,et.al}'
- Watchdog support added for NVIDIA AS3722
- Convert functions to static in TI AM335x
- Realigned previously defeated functionality in TI AM335x
- IIO ADC-TSC concurrency dead-lock/timeout resolution
- Addition of Power Management and Clock support for Samsung core
- DEFINE_PCI_DEVICE_TABLE macro removal from MFD Subsystem
- Greater use of irqdomain functionality in ST-E AB8500
- Removal of 'include/linux/mfd/abx500/ab8500-gpio.h'
- Wolfson WM831x PMIC Power Management changes s/poweroff/shutdown/
- Device Tree documentation added for TI/Nat Semi LP3943
- Version detection and voltage tables for TI TPS6586x PMIC devices
- Simplification of Freescale MC13XXX (de-)initialisation routines
- Clean-up and simplification of the Realtek parent driver
- Added support for RTL8402 Realtek PCI-Express card reader
- Resource leak fix for Maxim 77686
- Possible suspend BUG() fix in OMAP USB TLL
- Support for new Wolfson WM5110 Revision (D)
- Testing of automatic assignment of of_node in mfd_add_device()
- Reversion of the above when it started to cause issues
- Remove legacy Platform Data from;
TI TWL Core, Qualcomm SSBI and ST-E ABx500 Pinctrl
- Clean-ups; tabbing issues, function name changes, 'drvdata = NULL'
removal, unused uninitialised warning mitigation, error
message clarity, removal of redundant/duplicate checks,
licensing (GPL -> GPL2), coding consistency, duplicate
function declaration, ret checks, commit corrections,
redundant of_match_ptr() helper removal, spelling,
#if-deffery removal and header guards name changes"
* tag 'mfd-3.14-1' of git://git.linaro.org/people/ljones/mfd: (78 commits)
mfd: wm5110: Add register patch for rev D chip
mfd: omap-usb-tll: Don't hold lock during pm_runtime_get/put_sync()
gpio: lp3943: Remove redundant of_match_ptr helper
mfd: sta2x11-mfd: Use named constants for pci_power_t values
Documentation: mfd: Fix LDO index in s2mps11.txt
mfd: Cleanup mfd-mcp-sa11x0.h header
mfd: max8997: Use "IS_ENABLED(CONFIG_OF)" for DT code.
mfd: twl6030: Fix endianness problem in IRQ handler
mfd: sec-core: Add cells for S5M8767-clocks
mfd: max14577: Remove redundant of_match_ptr helper
mfd: twl6040: Fix sparse non static symbol warning
mfd: Revert "mfd: Always assign of_node in mfd_add_device()"
mfd: rtsx: Fix sparse non static symbol warning
mfd: max77693: Set proper maximum register for MUIC regmap
mfd: max77686: Fix regmap resource leak on driver remove
mfd: Represent correct filenames in file headers
mfd: rtsx: Add support for card reader rtl8402
mfd: rtsx: Add set pull control macro and simplify rtl8411
mfd: max8997: Enforce mfd_add_devices() return value check
mfd: mc13xxx: Simplify probe() & remove()
...
It was holiday season, so no wonder that there are little changes in
framework level, although diffstat shows quite many changes spreaded
over sound/* directories. Most of changes are cleanups, code
refactoring and fixes.
Some highlights:
- Removal of OSS sleep_on usages by Arnd
- Simplified memalloc helper codes, drop obsoleted features;
now it's built into PCM driver instead of an individual module
- Warn if PCM buffer preallocation fails, which will show page
allocation issues more clearly
- Compress offload API updates for sample rates by Vinod
- PCM glitch workaround on ctxfi emu20k1 by Sarah
- Drop cs46xx DSP blobs, using firmware loader now
- USB-audio quitks for Plantronics Gamecom 780, Creative VF0420,
and Focusrite Saffire 6
HD-audio specifics:
- Standardize Kconfigs of HD-audio codec drivers;
now "make localmodconfig" recognizes configs properly (finally!)
- Parallel PM implementation by Mengdong
- BayleyBay/ValleyView2 board fixups
- Broadwell audio support
- Runtime PM improvement (PantherPoint, etc)
- Quirks: Dell subwooer, Gigabyte mobo jack detection oddity,
Dell AiO click noise fixes, Dell headset mic fixes, etc
- Automatic bind with HDMI codec parser without generic parser
- More AD codec fixes (since 3.12 regression) including the automatic
stereo mix support
- Common Thinkpad ACPI helper for Realtek and Conexant codecs
ASoC specifics:
- Update to the generic DMA code to support deferred probe and managed
resources
- New drivers for BCM2835 (used in Raspberry Pi), Tegra with MAX98090
and Analog Devices AXI I2S and S/PDIF controller IPs
- Device tree support for the simple card, max98090 and cs42l52
- Conversion of the Samsung drivers to native dmaengine, making them
multiplatform compatible and hopefully helping keep them more modern
and up to date.
- More regmap conversions, including a very welcome one for twl6040
from Peter Ujfalusi
- A big overhaul of the DaVinci drivers also from Peter Ujfalusi
- Lots of DMA updates from Lars-Peter
- Improvements to the constraints handling code from Lars-Peter
- A very helpful conversion of the TWL4030 driver to regmap from Peter
- A new driver for the Freescale ESAI controller from Nicolin Chen
- Conversion of some of the drivers to use params_width()
- Extensions to DPCM for use with compressed audio from Liam
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJS3ocNAAoJEGwxgFQ9KSmkZEcP/3ImC7SrUC8NbCJF51tm31jk
Ed6eKNEJPcVEbUkZ6QN215w8OkgGCXnxMge1Mhu5VGFJiIjmdtls297DqJAXUL3d
l3/icCeYK7BN7ocMyaFRnwUipRtSiI0YuuU5cIcdbAoSrI8gtc+yZES82ivjOY4i
cFs/owDc5nqTNtx414N14a0NQOTJU5gMNjOVqCno8VLsUUUlg3nEY0kyHKiBNe4O
EqYmHHS2L4xu3tqxmh8pvd82OUnO53rMJXRntU6nuw1sgY7o+4pjmi3k10l1SHpH
Rv0dB8PunY72AnxPMPORISKHJ3isKPz0QdZVtdj0+WrFUwD1vrFE3Yktq3hYQvwd
NrOcMoEosQQhDcD74SSt1xfKVpW+gJazN4154VK/O2PKdNHLdc3/4rw95tYA8ywO
/jnDH+qNlzHSrAdq2Fj21FfTKOYvow8RBB+Rp05RymKESwJ9sk0yLLN1QgDoJXwW
kNBH6SxMxhaHOch0PzXidErydxnB2+j59i485DmJlI1GsjJTMFZvm8f85OlNstrn
ad5OsXKfBkS78oT5vBeDkp5uH56hPiyP89lhnd+cE/MgjBHHnQliJ8thLumgUjN+
vf6VGnpJywacH/VYrA2v8xKj514Q0CgsEydYtq5Z4J6WQVKrECOYlgYgpbJt6dsC
WE9DYZECRVCp74N8WB+z
=IWy2
-----END PGP SIGNATURE-----
Merge tag 'sound-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound updates from Takashi Iwai:
"It was holiday season, so no wonder that there are little changes in
framework level, although diffstat shows quite many changes spreaded
over sound/* directories. Most of changes are cleanups, code
refactoring and fixes.
Some highlights:
- Removal of OSS sleep_on usages by Arnd
- Simplified memalloc helper codes, drop obsoleted features; now it's
built into PCM driver instead of an individual module
- Warn if PCM buffer preallocation fails, which will show page
allocation issues more clearly
- Compress offload API updates for sample rates by Vinod
- PCM glitch workaround on ctxfi emu20k1 by Sarah
- Drop cs46xx DSP blobs, using firmware loader now
- USB-audio quitks for Plantronics Gamecom 780, Creative VF0420, and
Focusrite Saffire 6
HD-audio specifics:
- Standardize Kconfigs of HD-audio codec drivers; now "make
localmodconfig" recognizes configs properly (finally!)
- Parallel PM implementation by Mengdong
- BayleyBay/ValleyView2 board fixups
- Broadwell audio support
- Runtime PM improvement (PantherPoint, etc)
- Quirks: Dell subwooer, Gigabyte mobo jack detection oddity, Dell
AiO click noise fixes, Dell headset mic fixes, etc
- Automatic bind with HDMI codec parser without generic parser
- More AD codec fixes (since 3.12 regression) including the automatic
stereo mix support
- Common Thinkpad ACPI helper for Realtek and Conexant codecs
ASoC specifics:
- Update to the generic DMA code to support deferred probe and
managed resources
- New drivers for BCM2835 (used in Raspberry Pi), Tegra with MAX98090
and Analog Devices AXI I2S and S/PDIF controller IPs
- Device tree support for the simple card, max98090 and cs42l52
- Conversion of the Samsung drivers to native dmaengine, making them
multiplatform compatible and hopefully helping keep them more
modern and up to date.
- More regmap conversions, including a very welcome one for twl6040
from Peter Ujfalusi
- A big overhaul of the DaVinci drivers also from Peter Ujfalusi
- Lots of DMA updates from Lars-Peter
- Improvements to the constraints handling code from Lars-Peter
- A very helpful conversion of the TWL4030 driver to regmap from Peter
- A new driver for the Freescale ESAI controller from Nicolin Chen
- Conversion of some of the drivers to use params_width()
- Extensions to DPCM for use with compressed audio from Liam"
* tag 'sound-3.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (396 commits)
ASoC: dapm: Fix double prefix addition
ASoC: compress: Add suport for DPCM into compressed audio
ASoC: DPCM: make some DPCM API calls non static for compressed usage
ASoC: core: Fix possible NULL pointer dereference of pcm->config
ALSA: hda - add headset mic detect quirks for some Dell machines
ASoC: tlv320aic32x4: Fix regmap range_min
ASoC: core: Return -ENOTSUPP from set_sysclk() if no operation provided
ASoC: dapm: Change prototype of soc_widget_read
ASoC: samsung: Remove SND_DMAENGINE_PCM_FLAG_NO_RESIDUE flag
ASoC: axi-{spdif,i2s}: Remove SND_DMAENGINE_PCM_FLAG_NO_RESIDUE flag
ASoC: generic-dmaengine-pcm: Check DMA residue granularity
ASoC: generic-dmaengine-pcm: Check NO_RESIDUE flag at runtime
dma: pl330: Set residue_granularity
dma: Indicate residue granularity in dma_slave_caps
ASoC: simple-card: fix one bug to writing to the platform data
ASoC: pcm: Use snd_pcm_rate_mask_intersect() helper
ALSA: Add helper function for intersecting two rate masks
ASoC: s6000: Don't mix SNDRV_PCM_RATE_CONTINUOUS with specific rates
ASoC: fsl: Don't mix SNDRV_PCM_RATE_CONTINUOUS with specific rates
ASoC: pcm: Properly initialize hw->rate_max
...
otherwise we will get for some user-space applications
that use 'clone' with CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID
end up hitting an assert in glibc manifested by:
general protection ip:7f80720d364c sp:7fff98fd8a80 error:0 in
libc-2.13.so[7f807209e000+180000]
This is due to the nature of said operations which sets and clears
the PID. "In the successful one I can see that the page table of
the parent process has been updated successfully to use a
different physical page, so the write of the tid on
that page only affects the child...
On the other hand, in the failed case, the write seems to happen before
the copy of the original page is done, so both the parent and the child
end up with the same value (because the parent copies the page after
the write of the child tid has already happened)."
(Roger's analysis). The nature of this is due to the Xen's commit
of 51e2cac257ec8b4080d89f0855c498cbbd76a5e5
"x86/pvh: set only minimal cr0 and cr4 flags in order to use paging"
the CR0_WP was removed so COW features of the Linux kernel were not
operating properly.
While doing that also update the rest of the CR0 flags to be inline
with what a baremetal Linux kernel would set them to.
In 'secondary_startup_64' (baremetal Linux) sets:
X86_CR0_PE | X86_CR0_MP | X86_CR0_ET | X86_CR0_NE | X86_CR0_WP |
X86_CR0_AM | X86_CR0_PG
The hypervisor for HVM type guests (which PVH is a bit) sets:
X86_CR0_PE | X86_CR0_ET | X86_CR0_TS
For PVH it specifically sets:
X86_CR0_PG
Which means we need to set the rest: X86_CR0_MP | X86_CR0_NE |
X86_CR0_WP | X86_CR0_AM to have full parity.
Signed-off-by: Roger Pau Monne <roger.pau@citrix.com>
Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
[v1: Took out the cr4 writes to be a seperate patch]
[v2: 0-DAY kernel found xen_setup_gdt to be missing a static]