Implement a wrapper around task_get_css() to acquire the blkcg css for
a given task. The wrapper is necessary for cgroup writeback support
as there will be places outside blkcg proper trying to acquire
blkcg_css and blkio_cgrp_id will be undefined when !CONFIG_BLK_CGROUP.
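A minimal sketch of such a wrapper, assuming it is named task_get_blkcg_css()
(the name is not given above) and that the !CONFIG_BLK_CGROUP variant simply
avoids referencing blkio_cgrp_id:

static inline struct cgroup_subsys_state *
task_get_blkcg_css(struct task_struct *task)
{
#ifdef CONFIG_BLK_CGROUP
        return task_get_css(task, blkio_cgrp_id);
#else
        return NULL;
#endif
}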
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
bio_associate_current() currently open codes task_css() and
css_tryget_online() to find and pin $current's blkcg css. Abstract it
into task_get_css() which is implemented from cgroup side. As a task
is always associated with an online css for every subsystem except
while the css_set update is propagating, task_get_css() retries till
css_tryget_online() succeeds.
This is a cleanup and shouldn't lead to noticeable behavior changes.
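A minimal sketch of the helper as described, assuming it sits next to
task_css() and simply retries until css_tryget_online() succeeds:

static inline struct cgroup_subsys_state *
task_get_css(struct task_struct *task, int subsys_id)
{
        struct cgroup_subsys_state *css;

        rcu_read_lock();
        while (true) {
                css = task_css(task, subsys_id);
                if (likely(css_tryget_online(css)))
                        break;
                cpu_relax();
        }
        rcu_read_unlock();
        return css;
}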
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Add global constant blkcg_root_css which points to &blkcg_root.css.
This will be used by cgroup writeback support. If blkcg is disabled,
it's defined as ERR_PTR(-EINVAL).
v2: The declarations moved to include/linux/blk-cgroup.h as suggested
by Vivek.
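A rough sketch of the declarations this describes (placement per the v2
note above; the export is an assumption):

/* include/linux/blk-cgroup.h */
#ifdef CONFIG_BLK_CGROUP
extern struct cgroup_subsys_state * const blkcg_root_css;
#else
#define blkcg_root_css  ((struct cgroup_subsys_state *)ERR_PTR(-EINVAL))
#endif

/* block/blk-cgroup.c */
struct cgroup_subsys_state * const blkcg_root_css = &blkcg_root.css;
EXPORT_SYMBOL_GPL(blkcg_root_css);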
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Jens Axboe <axboe@fb.com>
Add global mem_cgroup_root_css which points to the root memcg css.
This will be used by cgroup writeback support. If memcg is disabled,
it's defined as ERR_PTR(-EINVAL).
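A rough sketch of the corresponding declaration, mirroring the
blkcg_root_css pattern above (exact placement assumed):

/* include/linux/memcontrol.h */
#ifdef CONFIG_MEMCG
extern struct cgroup_subsys_state *mem_cgroup_root_css;
#else
#define mem_cgroup_root_css     ((struct cgroup_subsys_state *)ERR_PTR(-EINVAL))
#endif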
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
The header file will be used more widely with the pending cgroup
writeback support and the current set of dummy declarations aren't
enough to handle different config combinations. Update as follows.
* Drop the struct cgroup declaration. None of the dummy defs need it.
* Define blkcg as an empty struct instead of just declaring it.
* Wrap dummy function defs in CONFIG_BLOCK. Some functions use block
data types and none of them are to be used w/o block enabled.
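A hedged sketch of the resulting header layout; the stub names below are
representative examples, not an exhaustive list:

#ifdef CONFIG_BLK_CGROUP
/* ... real declarations and inline helpers ... */
#else   /* CONFIG_BLK_CGROUP */

struct blkcg {
};

#ifdef CONFIG_BLOCK

static inline int blkcg_init_queue(struct request_queue *q) { return 0; }
static inline void blkcg_drain_queue(struct request_queue *q) { }
static inline void blkcg_exit_queue(struct request_queue *q) { }

#endif  /* CONFIG_BLOCK */
#endif  /* CONFIG_BLK_CGROUP */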
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
cgroup aware writeback support will require exposing some blkcg
details. In preparation, move block/blk-cgroup.h to
include/linux/blk-cgroup.h. This patch is a pure file move.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
When modifying PG_Dirty on cached file pages, update the new
MEM_CGROUP_STAT_DIRTY counter. This is done in the same places where
global NR_FILE_DIRTY is managed. The new memcg stat is visible in the
per memcg memory.stat cgroupfs file. The most recent past attempt at
this was http://thread.gmane.org/gmane.linux.kernel.cgroups/8632
The new accounting supports future efforts to add per cgroup dirty
page throttling and writeback. It also helps an administrator break
down a container's memory usage and provides evidence to understand
memcg oom kills (the new dirty count is included in memcg oom kill
messages).
The ability to move page accounting between memcg
(memory.move_charge_at_immigrate) makes this accounting more
complicated than the global counter. The existing
mem_cgroup_{begin,end}_page_stat() lock is used to serialize move
accounting with stat updates.
Typical update operation:
memcg = mem_cgroup_begin_page_stat(page)
if (TestSetPageDirty()) {
        [...]
        mem_cgroup_update_page_stat(memcg)
}
mem_cgroup_end_page_stat(memcg)
Summary of mem_cgroup_end_page_stat() overhead:
- Without CONFIG_MEMCG it's a no-op
- With CONFIG_MEMCG and no inter memcg task movement, it's just
rcu_read_lock()
- With CONFIG_MEMCG and inter memcg task movement, it's
rcu_read_lock() + spin_lock_irqsave()
A memcg parameter is added to several routines because their callers
now grab mem_cgroup_begin_page_stat() which returns the memcg later
needed by mem_cgroup_update_page_stat().
Because mem_cgroup_begin_page_stat() may disable interrupts, some
adjustments are needed:
- move __mark_inode_dirty() from __set_page_dirty() to its caller.
__mark_inode_dirty() locking does not want interrupts disabled.
- use spin_lock_irqsave(tree_lock) rather than spin_lock_irq() in
__delete_from_page_cache(), replace_page_cache_page(),
invalidate_complete_page2(), and __remove_mapping().
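As an illustration of the irqsave conversion and the added memcg parameter,
a hedged sketch of what a caller such as invalidate_complete_page2() ends up
doing (simplified, and the extra __delete_from_page_cache() argument is an
assumption based on the description above):

        struct mem_cgroup *memcg;
        unsigned long flags;

        memcg = mem_cgroup_begin_page_stat(page);
        spin_lock_irqsave(&mapping->tree_lock, flags);
        /* PageDirty() and other checks elided */
        __delete_from_page_cache(page, NULL, memcg);
        spin_unlock_irqrestore(&mapping->tree_lock, flags);
        mem_cgroup_end_page_stat(memcg);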
text data bss dec hex filename
8925147 1774832 1785856 12485835 be84cb vmlinux-!CONFIG_MEMCG-before
8925339 1774832 1785856 12486027 be858b vmlinux-!CONFIG_MEMCG-after
+192 text bytes
8965977 1784992 1785856 12536825 bf4bf9 vmlinux-CONFIG_MEMCG-before
8966750 1784992 1785856 12537598 bf4efe vmlinux-CONFIG_MEMCG-after
+773 text bytes
Performance tests run on v4.0-rc1-36-g4f671fe2f952. Lower is better for
all metrics; they're all wall clock or cycle counts. The read and write
fault benchmarks just measure fault time; they do not include I/O time.
* CONFIG_MEMCG not set:
baseline patched
kbuild 1m25.030000(+-0.088% 3 samples) 1m25.426667(+-0.120% 3 samples)
dd write 100 MiB 0.859211561 +-15.10% 0.874162885 +-15.03%
dd write 200 MiB 1.670653105 +-17.87% 1.669384764 +-11.99%
dd write 1000 MiB 8.434691190 +-14.15% 8.474733215 +-14.77%
read fault cycles 254.0(+-0.000% 10 samples) 253.0(+-0.000% 10 samples)
write fault cycles 2021.2(+-3.070% 10 samples) 1984.5(+-1.036% 10 samples)
* CONFIG_MEMCG=y root_memcg:
baseline patched
kbuild 1m25.716667(+-0.105% 3 samples) 1m25.686667(+-0.153% 3 samples)
dd write 100 MiB 0.855650830 +-14.90% 0.887557919 +-14.90%
dd write 200 MiB 1.688322953 +-12.72% 1.667682724 +-13.33%
dd write 1000 MiB 8.418601605 +-14.30% 8.673532299 +-15.00%
read fault cycles 266.0(+-0.000% 10 samples) 266.0(+-0.000% 10 samples)
write fault cycles 2051.7(+-1.349% 10 samples) 2049.6(+-1.686% 10 samples)
* CONFIG_MEMCG=y non-root_memcg:
baseline patched
kbuild 1m26.120000(+-0.273% 3 samples) 1m25.763333(+-0.127% 3 samples)
dd write 100 MiB 0.861723964 +-15.25% 0.818129350 +-14.82%
dd write 200 MiB 1.669887569 +-13.30% 1.698645885 +-13.27%
dd write 1000 MiB 8.383191730 +-14.65% 8.351742280 +-14.52%
read fault cycles 265.7(+-0.172% 10 samples) 267.0(+-0.000% 10 samples)
write fault cycles 2070.6(+-1.512% 10 samples) 2084.4(+-2.148% 10 samples)
As expected anon page faults are not affected by this patch.
tj: Updated to apply on top of the recent cancel_dirty_page() changes.
Signed-off-by: Sha Zhengju <handai.szj@gmail.com>
Signed-off-by: Greg Thelen <gthelen@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
cancel_dirty_page() had some issues and b9ea25152e ("page_writeback:
clean up mess around cancel_dirty_page()") replaced it with
account_page_cleaned() which makes the caller responsible for clearing
the dirty bit; unfortunately, the planned changes for cgroup writeback
support require synchronization between dirty bit manipulation
stat updates. While we can open-code such synchronization in each
account_page_cleaned() callsite, that's gonna be unnecessarily awkward
and verbose.
This patch revives cancel_dirty_page() but in a more restricted form.
All it does is TestClearPageDirty() followed by account_page_cleaned()
invocation if the page was dirty. This helper covers all
account_page_cleaned() usages except for __delete_from_page_cache()
which is a special case anyway and left alone. As this leaves no
module user for account_page_cleaned(), EXPORT_SYMBOL() is dropped
from it.
This patch just revives cancel_dirty_page() as a trivial wrapper to
replace equivalent usages and doesn't introduce any functional
changes.
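A minimal sketch of the revived helper as described above (the exact
placement and the account_page_cleaned() signature are assumptions):

static inline void cancel_dirty_page(struct page *page)
{
        /* Only touch the dirty accounting if the page was actually dirty */
        if (TestClearPageDirty(page))
                account_page_cleaned(page, page_mapping(page));
}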
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Jens Axboe <axboe@fb.com>
Storage controllers may expose multiple block devices that share hardware
resources managed by blk-mq. This patch enhances the shared tags so a
low-level driver can access the shared resources not tied to the unshared
h/w contexts. This way the LLD can dynamically add and delete disks and
request queues without having to track all the request_queue hctx's to
iterate outstanding tags.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Currently dm-multipath has to clone the bios for every request sent
to the lower devices, which wastes cpu cycles and ties down memory.
This patch instead adds a new REQ_CLONE flag that instructs req_bio_endio
to not complete bios attached to a request, which we set on clone
requests similar to bios in a flush sequence. With this change I/O
errors on a path failure only get propagated to dm-multipath, which
can then either resubmit the I/O or complete the bios on the original
request.
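A hedged sketch of the req_bio_endio() behavior this describes; the
error/uptodate bookkeeping is elided, and grouping REQ_CLONE with
REQ_FLUSH_SEQ is an assumption drawn from the "similar to bios in a flush
sequence" remark:

static void req_bio_endio(struct request *rq, struct bio *bio,
                          unsigned int nbytes, int error)
{
        /* ... error/uptodate bookkeeping elided ... */

        bio_advance(bio, nbytes);

        /*
         * Don't complete the bio here if the request is a clone
         * (REQ_CLONE) or part of a flush sequence; the owner of the
         * original bios is responsible for completing them.
         */
        if (bio->bi_iter.bi_size == 0 &&
            !(rq->cmd_flags & (REQ_FLUSH_SEQ | REQ_CLONE)))
                bio_endio(bio, error);
}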
I've done some basic testing of this on a Linux target with ALUA support,
and it survives path failures during I/O nicely.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Commit c4cf5261 ("bio: skip atomic inc/dec of ->bi_remaining for
non-chains") regressed all existing callers that followed this pattern:
1) saving a bio's original bi_end_io
2) wiring up an intermediate bi_end_io
3) restoring the original bi_end_io from intermediate bi_end_io
4) calling bio_endio() to execute the restored original bi_end_io
The regression was due to BIO_CHAIN only ever getting set if
bio_inc_remaining() is called. For the above pattern it isn't set until
step 3 above (step 2 would've needed to establish BIO_CHAIN). As such
the first bio_endio(), in step 2 above, never decremented __bi_remaining
before calling the intermediate bi_end_io -- leaving __bi_remaining with
the value 1 instead of 0. When bio_inc_remaining() occurred during step
3 it brought it to a value of 2. When the second bio_endio() was
called, in step 4 above, it should've called the original bi_end_io but
it didn't because there was an extra reference that wasn't dropped (due
to atomic operations being optimized away since BIO_CHAIN wasn't set
upfront).
Fix this issue by removing the __bi_remaining management complexity for
all callers that use the above pattern -- bio_chain() is the only
interface that _needs_ to be concerned with __bi_remaining. For the
above pattern callers just expect the bi_end_io they set to get called!
Remove bio_endio_nodec() and also remove all bio_inc_remaining() calls
that aren't associated with the bio_chain() interface.
Also, the bio_inc_remaining() interface has been moved local to bio.c.
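For reference, a hedged sketch of the save/restore endio stacking pattern
listed in steps 1-4 above (the hook structure and names are illustrative
only):

struct endio_hook {
        bio_end_io_t    *orig_end_io;   /* step 1: saved original */
        void            *orig_private;
};

static void stacked_end_io(struct bio *bio, int error)
{
        struct endio_hook *h = bio->bi_private;

        /* step 3: restore the original completion */
        bio->bi_end_io = h->orig_end_io;
        bio->bi_private = h->orig_private;
        kfree(h);

        /* step 4: re-run bio_endio(); this must call the original bi_end_io */
        bio_endio(bio, error);
}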
Fixes: c4cf5261 ("bio: skip atomic inc/dec of ->bi_remaining for non-chains")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
This patch exports blkdev_reread_part() for block drivers and also
introduces __blkdev_reread_part().
For some drivers, such as loop, reread of partitions can be run
from the release path, and bd_mutex may already be held prior to
calling ioctl_by_bdev(bdev, BLKRRPART, 0), so introduce
__blkdev_reread_part for use in such cases.
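A hedged sketch of the resulting split, with blkdev_reread_part() taking
bd_mutex itself and __blkdev_reread_part() expecting the caller to hold it
(the bodies are illustrative, not the exact exported code):

int __blkdev_reread_part(struct block_device *bdev)
{
        struct gendisk *disk = bdev->bd_disk;

        if (!disk_part_scan_enabled(disk) || bdev != bdev->bd_contains)
                return -EINVAL;
        if (!capable(CAP_SYS_ADMIN))
                return -EACCES;

        lockdep_assert_held(&bdev->bd_mutex);

        return rescan_partitions(disk, bdev);
}

int blkdev_reread_part(struct block_device *bdev)
{
        int res;

        /* Callers that don't already hold bd_mutex use this variant */
        mutex_lock(&bdev->bd_mutex);
        res = __blkdev_reread_part(bdev);
        mutex_unlock(&bdev->bd_mutex);

        return res;
}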
CC: Christoph Hellwig <hch@lst.de>
CC: Jens Axboe <axboe@kernel.dk>
CC: Tejun Heo <tj@kernel.org>
CC: Alexander Viro <viro@zeniv.linux.org.uk>
CC: Markus Pargmann <mpa@pengutronix.de>
CC: Stefan Weinhuber <wein@de.ibm.com>
CC: Stefan Haberland <stefan.haberland@de.ibm.com>
CC: Sebastian Ott <sebott@linux.vnet.ibm.com>
CC: Fabian Frederick <fabf@skynet.be>
CC: Ming Lei <ming.lei@canonical.com>
CC: David Herrmann <dh.herrmann@gmail.com>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Peter Zijlstra <peterz@infradead.org>
CC: nbd-general@lists.sourceforge.net
CC: linux-s390@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Stop abusing struct page functionality and the swap end_io handler, and
instead add a modified version of the blk-lib.c bio_batch helpers.
Also move the block I/O code into swap.c as they are directly tied into
each other.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Pavel Machek <pavel@ucw.cz>
Tested-by: Ming Lin <mlin@kernel.org>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net>
Signed-off-by: Jens Axboe <axboe@fb.com>
Various previous patches removed bits and left holes, collapse them
all. Leave the reset start bit where it is, we don't need to change
that.
Signed-off-by: Jens Axboe <axboe@fb.com>
Since the big barrier rewrite/removal in 2007 we never fail FLUSH or
FUA requests, which means we can remove the magic BIO_EOPNOTSUPP flag
that was used to help propagate those failures to the buffer_head layer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
lockdep gets unhappy about not disabling irqs when taking the queue_lock
around it. Instead of trying to fix that up, just switch to an atomic_t
and get rid of the lock.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
This moves the request types and hacks out of the block code and into the
old IDE driver. There is a small amount of code duplication due to this,
but it's not too bad.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
These values are only used by the IDE driver, so move them into it
by allowing drivers to take cmd_type values after the first private
one. Note that we have to turn cmd_type into a plain unsigned integer
so that gcc doesn't complain about mismatching enum types.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Struct bio has a reference count that controls when it can be freed.
The most common use case is allocating the bio, which then returns with
a single reference to it, doing IO, and then dropping that single
reference. We can remove the atomic_dec_and_test() in the completion
path if nobody else is holding a reference to the bio.
If someone does call bio_get() on the bio, then we flag the bio as
now having a valid reference count that must be properly honored when
the bio is put.
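A hedged sketch of the resulting bio_put() fast path (the BIO_REFFED flag
and __bi_cnt field names are assumptions):

void bio_put(struct bio *bio)
{
        if (!bio_flagged(bio, BIO_REFFED)) {
                /* common case: nobody called bio_get(), free directly */
                bio_free(bio);
        } else {
                BIO_BUG_ON(!atomic_read(&bio->__bi_cnt));

                /* last put frees it */
                if (atomic_dec_and_test(&bio->__bi_cnt))
                        bio_free(bio);
        }
}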
Tested-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Struct bio has an atomic ref count for chained bios, and we use this
to know when to end IO on the bio. However, most bios are not chained,
so we don't need to always introduce this atomic operation as part of
ending IO.
Add a helper to elevate the bi_remaining count, and flag the bio as
now actually needing the decrement at end_io time. Rename the field
to __bi_remaining to catch any current users of this doing the
incrementing manually.
For high IOPS workloads, this reduces the overhead of bio_endio()
substantially.
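A hedged sketch of the completion-side check this enables (flag and field
names as described above; details assumed):

static inline bool bio_remaining_done(struct bio *bio)
{
        /* Un-chained bios always have __bi_remaining == 1; end IO right away */
        if (!bio_flagged(bio, BIO_CHAIN))
                return true;

        if (atomic_dec_and_test(&bio->__bi_remaining)) {
                clear_bit(BIO_CHAIN, &bio->bi_flags);
                return true;
        }

        return false;
}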
Tested-by: Robert Elliott <elliott@hp.com>
Acked-by: Kent Overstreet <kent.overstreet@gmail.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
Pull crypto fixes from Herbert Xu:
"This fixes a build problem with bcm63xx and yet another fix to the
memzero_explicit function to ensure that the memset is not elided"
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
hwrng: bcm63xx - Fix driver compilation
lib: make memzero_explicit more robust against dead store elimination
In commit 0b053c9518 ("lib: memzero_explicit: use barrier instead
of OPTIMIZER_HIDE_VAR"), we made memzero_explicit() more robust in
case LTO would decide to inline memzero_explicit() and eventually
find out it could be eliminated as a dead store.
While using barrier() works well for the case of gcc, recent efforts
from LLVMLinux people suggest using llvm as an alternative to gcc,
and there, Stephan found in a simple stand-alone user space example
that llvm could nevertheless optimize and thus eliminate the memset().
A similar issue has been observed in the referenced llvm bug report,
which is regarded as not-a-bug.
Based on some experiments, icc is a bit special on its own: while it
doesn't seem to eliminate the memset(), it could do so with its own
implementation, and then produce similar findings as with llvm.
The fix in this patch now works for all three compilers (also tested
with more aggressive optimization levels). Arguably, in the current
kernel tree it's more of a theoretical issue, but imho, it's better
to be pedantic about it.
It's clearly visible with gcc/llvm though, with the code below: had we
used only barrier() here, llvm would have omitted the clearing; not so
with the barrier_data() variant:
static inline void memzero_explicit(void *s, size_t count)
{
        memset(s, 0, count);
        barrier_data(s);
}

int main(void)
{
        char buff[20];
        memzero_explicit(buff, sizeof(buff));
        return 0;
}
$ gcc -O2 test.c
$ gdb a.out
(gdb) disassemble main
Dump of assembler code for function main:
0x0000000000400400 <+0>: lea -0x28(%rsp),%rax
0x0000000000400405 <+5>: movq $0x0,-0x28(%rsp)
0x000000000040040e <+14>: movq $0x0,-0x20(%rsp)
0x0000000000400417 <+23>: movl $0x0,-0x18(%rsp)
0x000000000040041f <+31>: xor %eax,%eax
0x0000000000400421 <+33>: retq
End of assembler dump.
$ clang -O2 test.c
$ gdb a.out
(gdb) disassemble main
Dump of assembler code for function main:
0x00000000004004f0 <+0>: xorps %xmm0,%xmm0
0x00000000004004f3 <+3>: movaps %xmm0,-0x18(%rsp)
0x00000000004004f8 <+8>: movl $0x0,-0x8(%rsp)
0x0000000000400500 <+16>: lea -0x18(%rsp),%rax
0x0000000000400505 <+21>: xor %eax,%eax
0x0000000000400507 <+23>: retq
End of assembler dump.
As gcc, clang, and also icc define __GNUC__, it's sufficient to define
this in compiler-gcc.h only for it to be picked up. For a fallback or
otherwise unsupported compiler, we define it as a plain barrier; the same
goes for ecc, which does not support gcc inline asm.
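For reference, a barrier_data()-style definition for gcc-compatible
compilers might look like the following sketch, with the fallback simply
degrading to barrier():

/* compiler-gcc.h */
#define barrier_data(ptr) __asm__ __volatile__("" : : "r"(ptr) : "memory")

/* compiler.h fallback for compilers without gcc inline asm */
#ifndef barrier_data
# define barrier_data(ptr) barrier()
#endif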
Reference: https://llvm.org/bugs/show_bug.cgi?id=15495
Reported-by: Stephan Mueller <smueller@chronox.de>
Tested-by: Stephan Mueller <smueller@chronox.de>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Stephan Mueller <smueller@chronox.de>
Cc: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: mancha security <mancha1@zoho.com>
Cc: Mark Charlebois <charlebm@gmail.com>
Cc: Behan Webster <behanw@converseincode.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Pull networking fixes from David Miller:
1) Receive packet length needs to be adjusted by 2 on RX to accommodate
the two padding bytes in the altera_tse driver. From Vlastimil Setka.
2) If rx frame is dropped due to out of memory in macb driver, we leave
the receive ring descriptors in an undefined state. From Punnaiah
Choudary Kalluri
3) Some netlink subsystems erroneously signal NLM_F_MULTI. That is
only for dumps. Fix from Nicolas Dichtel.
4) Fix mis-use of raw rt->rt_pmtu value in ipv4, one must always go via
the ipv4_mtu() helper. From Herbert Xu.
5) Fix null deref in bridge netfilter, and miscalculated lengths in
jump/goto nf_tables verdicts. From Florian Westphal.
6) Unhash ping sockets properly.
7) Software implementation of BPF divide did 64/32 rather than 64/64
bit divide. The JITs got it right. Fix from Alexei Starovoitov.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (30 commits)
ipv4: Missing sk_nulls_node_init() in ping_unhash().
net: fec: Fix RGMII-ID mode
net/mlx4_en: Schedule napi when RX buffers allocation fails
netxen_nic: use spin_[un]lock_bh around tx_clean_lock
net/mlx4_core: Fix unaligned accesses
mlx4_en: Use correct loop cursor in error path.
cxgb4: Fix MC1 memory offset calculation
bnx2x: Delay during kdump load
net: Fix Kernel Panic in bonding driver debugfs file: rlb_hash_table
net: dsa: Fix scope of eeprom-length property
net: macb: Fix race condition in driver when Rx frame is dropped
hv_netvsc: Fix a bug in netvsc_start_xmit()
altera_tse: Correct rx packet length
mlx4: Fix tx ring affinity_mask creation
tipc: fix problem with parallel link synchronization mechanism
tipc: remove wrong use of NLM_F_MULTI
bridge/nl: remove wrong use of NLM_F_MULTI
bridge/mdb: remove wrong use of NLM_F_MULTI
net: sched: act_connmark: don't zap skb->nfct
trivial: net: systemport: bcmsysport.h: fix 0x0x prefix
...
Remove from guest code the handling of task migration during a
pvclock read; instead use the correct protocol in KVM.
This removes the need for task migration notifiers in core
scheduler code.
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm changes from Paolo Bonzini:
"Remove from guest code the handling of task migration during a pvclock
read; instead use the correct protocol in KVM.
This removes the need for task migration notifiers in core scheduler
code"
[ The scheduler people really hated the migration notifiers, so this was
kind of required - Linus ]
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
x86: pvclock: Really remove the sched notifier for cross-cpu migrations
kvm: x86: fix kvmclock update protocol
Here are some small tty/serial driver fixes for 4.1-rc2.
They include some minor fixes that resolve reported issues, and a new
device quirk.
All have been in linux-next successfully.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge tag 'tty-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Pull tty/serial fixes from Greg KH:
"Here are some small tty/serial driver fixes for 4.1-rc2.
They include some minor fixes that resolve reported issues, and a new
device quirk.
All have been in linux-next successfully"
* tag 'tty-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
serial: 8250_pci: Add support for 16 port Exar boards
serial: samsung: fix serial console break
tty/serial: at91: maxburst was missing for dma transfers
serial: of-serial: Remove device_type = "serial" registration
serial: xilinx: Use platform_get_irq to get irq description structure
serial: core: Fix kernel-doc build warnings
tty: Re-add external interface for tty_set_termios()
Here are a number of small USB fixes for 4.1-rc2. They revert one
problem patch, fix some minor things, and add some new quirks for
"broken" devices.
All have been in linux-next successfully.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Merge tag 'usb-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are a number of small USB fixes for 4.2-rc2. They revert one
problem patch, fix some minor things, and add some new quirks for
"broken" devices.
All have been in linux-next successfully"
* tag 'usb-4.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
cdc-acm: prevent infinite loop when parsing CDC headers.
Revert "usb: host: ehci-msm: Use devm_ioremap_resource instead of devm_ioremap"
usb: chipidea: otg: remove mutex unlock and lock while stop and start role
uas: Set max_sectors_240 quirk for ASM1053 devices
uas: Add US_FL_MAX_SECTORS_240 flag
uas: Allow uas_use_uas_driver to return usb-storage flags
NLM_F_MULTI must be used only when a NLMSG_DONE message is sent. In fact,
it is sent only at the end of a dump.
Libraries like libnl will wait forever for NLMSG_DONE.
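A hedged illustration of the rule: only dump replies, which are terminated
by NLMSG_DONE, should carry NLM_F_MULTI; the helper shape below is
hypothetical:

static int fill_reply(struct sk_buff *skb, u32 portid, u32 seq, bool dump)
{
        struct nlmsghdr *nlh;

        /* NLM_F_MULTI only for dump parts that end with NLMSG_DONE */
        nlh = nlmsg_put(skb, portid, seq, RTM_NEWLINK, 0,
                        dump ? NLM_F_MULTI : 0);
        if (!nlh)
                return -EMSGSIZE;
        /* ... netlink attributes ... */
        nlmsg_end(skb, nlh);
        return 0;
}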
Fixes: e5a55a8987 ("net: create generic bridge ops")
Fixes: 815cccbf10 ("ixgbe: add setlink, getlink support to ixgbe and ixgbevf")
CC: John Fastabend <john.r.fastabend@intel.com>
CC: Sathya Perla <sathya.perla@emulex.com>
CC: Subbu Seetharaman <subbu.seetharaman@emulex.com>
CC: Ajit Khaparde <ajit.khaparde@emulex.com>
CC: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
CC: intel-wired-lan@lists.osuosl.org
CC: Jiri Pirko <jiri@resnulli.us>
CC: Scott Feldman <sfeldma@gmail.com>
CC: Stephen Hemminger <stephen@networkplumber.org>
CC: bridge@lists.linux-foundation.org
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull s390 updates from Martin Schwidefsky:
"One additional new feature for 4.1, a new PRNG based on SHA-512 for
the zcrypt driver.
Two memory management related changes, the page table reallocation for
KVM is removed, and with file ptes gone the encoding of page table
entries is improved.
And three bug fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/zcrypt: Introduce new SHA-512 based Pseudo Random Generator.
s390/mm: change swap pte encoding and pgtable cleanup
s390/mm: correct transfer of dirty & young bits in __pmd_to_pte
s390/bpf: add dependency to z196 features
s390/3215: free memory in error path
s390/kvm: remove delayed reallocation of page tables for KVM
kexec: allocate the kexec control page with KEXEC_CONTROL_MEMORY_GFP
This is needed by Bluetooth hci_uart module to be able to change speed
of Bluetooth controller and local UART.
Signed-off-by: Frederic Danis <frederic.danis@linux.intel.com>
Reviewed-by: Peter Hurley <peter@hurleysoftware.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The usb-storage driver sets max_sectors = 240 in its scsi-host template;
for uas we do not want to do that for all devices, but testing has shown
that some devices need it.
This commit adds a US_FL_MAX_SECTORS_240 flag for such devices and
implements support for it in uas.c; while at it, it also adds support
for US_FL_MAX_SECTORS_64 to uas.c.
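A hedged sketch of how uas.c might apply the two quirks in its slave
configuration path (function and field names assumed from the surrounding
driver code):

static int uas_slave_configure(struct scsi_device *sdev)
{
        struct uas_dev_info *devinfo = sdev->hostdata;

        if (devinfo->flags & US_FL_MAX_SECTORS_64)
                blk_queue_max_hw_sectors(sdev->request_queue, 64);
        else if (devinfo->flags & US_FL_MAX_SECTORS_240)
                blk_queue_max_hw_sectors(sdev->request_queue, 240);

        return 0;
}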
Cc: stable@vger.kernel.org # 3.16
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for your net tree,
they are:
1) Fix a crash in nf_tables when dictionaries are used from the ruleset,
due to memory corruption, from Florian Westphal.
2) Fix another crash in nf_queue when used with br_netfilter. Also from
Florian.
Both fixes are related to new stuff that got in 4.0-rc.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) mlx4 doesn't check fully for supported valid RSS hash function, fix
from Amir Vadai
2) Off by one in ibmveth_change_mtu(), from David Gibson
3) Prevent altera chip from reporting false error interrupts in some
circumstances, from Chee Nouk Phoon
4) Get rid of that stupid endless loop trying to allocate a FIN packet
in TCP, and in the process kill deadlocks. From Eric Dumazet
5) Fix get_rps_cpus() crash due to wrong invalid-cpu value, also from
Eric Dumazet
6) Fix two bugs in async rhashtable resizing, from Thomas Graf
7) Fix topology server listener socket namespace bug in TIPC, from Ying
Xue
8) Add some missing HAS_DMA kconfig dependencies, from Geert
Uytterhoeven
9) bgmac driver intends to force re-polling but does so by returning
the wrong value from its ->poll() handler. Fix from Rafał Miłecki
10) When the creator of an rhashtable configures a max size for it,
don't bark in the logs and drop insertions when that is exceeded.
Fix from Johannes Berg
11) Recover from out of order packets in ppp mppe properly, from Sylvain
Rochet
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
bnx2x: really disable TPA if 'disable_tpa' option is set
net:treewide: Fix typo in drivers/net
net/mlx4_en: Prevent setting invalid RSS hash function
mdio-mux-gpio: use new gpiod_get_array and gpiod_put_array functions
netfilter; Add some missing default cases to switch statements in nft_reject.
ppp: mppe: discard late packet in stateless mode
ppp: mppe: sanity error path rework
net/bonding: Make DRV macros private
net: rfs: fix crash in get_rps_cpus()
altera tse: add support for fixed-links.
pxa168: fix double deallocation of managed resources
net: fix crash in build_skb()
net: eth: altera: Resolve false errors from MSGDMA to TSE
ehea: Fix memory hook reference counting crashes
net/tg3: Release IRQs on permanent error
net: mdio-gpio: support access that may sleep
inet: fix possible panic in reqsk_queue_unlink()
rhashtable: don't attempt to grow when at max_size
bgmac: fix requests for extra polling calls from NAPI
tcp: avoid looping in tcp_send_fin()
...
This reverts commits 0a4e6be9ca
and 80f7fdb1c7.
The task migration notifier was originally introduced in order to support
the pvclock vsyscall with non-synchronized TSC, but KVM only supports it
with synchronized TSC. Hence, on KVM the race condition is only needed
due to a bad implementation on the host side, and even then it's so rare
that it's mostly theoretical.
As far as KVM is concerned it's possible to fix the host, avoiding the
additional complexity in the vDSO and the (re)introduction of the task
migration notifier.
Xen, on the other hand, hasn't yet implemented vsyscall support at
all, so we do not care about its plans for non-synchronized TSC.
Reported-by: Peter Zijlstra <peterz@infradead.org>
Suggested-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pull intel iommu updates from David Woodhouse:
"This lays a little of the groundwork for upcoming Shared Virtual
Memory support — fixing some bogus #defines for capability bits and
adding the new ones, and starting to use the new wider page tables
where we can, in anticipation of actually filling in the new fields
therein.
It also allows graphics devices to be assigned to VM guests again.
This got broken in 3.17 by disallowing assignment of RMRR-afflicted
devices. Like USB, we do understand why there's an RMRR for graphics
devices — and unlike USB, it's actually sane. So we can make an
exception for graphics devices, just as we do USB controllers.
Finally, tone down the warning about the X2APIC_OPT_OUT bit, due to
persistent requests. X2APIC_OPT_OUT was added to the spec as a nasty
hack to allow broken BIOSes to forbid us from using X2APIC when they
do stupid and invasive things and would break if we did.
Someone noticed that since Windows doesn't have full IOMMU support for
DMA protection, setting the X2APIC_OPT_OUT bit made Windows avoid
initialising the IOMMU on the graphics unit altogether.
This means that it would be available for use in "driver mode", where
the IOMMU registers are made available through a BAR of the graphics
device and the graphics driver can do SVM all for itself.
So they started setting the X2APIC_OPT_OUT bit on *all* platforms with
SVM capabilities. And even the platforms which *might*, if the
planets had been aligned correctly, possibly have had SVM capability
but which in practice actually don't"
* git://git.infradead.org/intel-iommu:
iommu/vt-d: support extended root and context entries
iommu/vt-d: Add new extended capabilities from v2.3 VT-d specification
iommu/vt-d: Allow RMRR on graphics devices too
iommu/vt-d: Print x2apic opt out info instead of printing a warning
iommu/vt-d: kill bogus ecap_niotlb_iunits()
Highlights include:
Stable patches:
- Fix a regression in /proc/self/mountstats
- Fix the pNFS flexfiles O_DIRECT support
- Fix high load average due to callback thread sleeping
Bugfixes:
- Various patches to fix the pNFS layoutcommit support
- Do not cache pNFS deviceids unless server notifications are enabled
- Fix a SUNRPC transport reconnection regression
- make debugfs file creation failure non-fatal in SUNRPC
- Another fix for circular directory warnings on NFSv4 "junctioned" mountpoints
- Fix locking around NFSv4.2 fallocate() support
- Truncating NFSv4 file opens should also sync O_DIRECT writes
- Prevent infinite loop in rpcrdma_ep_create()
Features:
- Various improvements to the RDMA transport code's handling of memory
registration
- Various code cleanups
Merge tag 'nfs-for-4.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Another set of mainly bugfixes and a couple of cleanups. No new
functionality in this round.
Highlights include:
Stable patches:
- Fix a regression in /proc/self/mountstats
- Fix the pNFS flexfiles O_DIRECT support
- Fix high load average due to callback thread sleeping
Bugfixes:
- Various patches to fix the pNFS layoutcommit support
- Do not cache pNFS deviceids unless server notifications are enabled
- Fix a SUNRPC transport reconnection regression
- make debugfs file creation failure non-fatal in SUNRPC
- Another fix for circular directory warnings on NFSv4 "junctioned"
mountpoints
- Fix locking around NFSv4.2 fallocate() support
- Truncating NFSv4 file opens should also sync O_DIRECT writes
- Prevent infinite loop in rpcrdma_ep_create()
Features:
- Various improvements to the RDMA transport code's handling of
memory registration
- Various code cleanups"
* tag 'nfs-for-4.1-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (55 commits)
fs/nfs: fix new compiler warning about boolean in switch
nfs: Remove unneeded casts in nfs
NFS: Don't attempt to decode missing directory entries
Revert "nfs: replace nfs_add_stats with nfs_inc_stats when add one"
NFS: Rename idmap.c to nfs4idmap.c
NFS: Move nfs_idmap.h into fs/nfs/
NFS: Remove CONFIG_NFS_V4 checks from nfs_idmap.h
NFS: Add a stub for GETDEVICELIST
nfs: remove WARN_ON_ONCE from nfs_direct_good_bytes
nfs: fix DIO good bytes calculation
nfs: Fetch MOUNTED_ON_FILEID when updating an inode
sunrpc: make debugfs file creation failure non-fatal
nfs: fix high load average due to callback thread sleeping
NFS: Reduce time spent holding the i_mutex during fallocate()
NFS: Don't zap caches on fallocate()
xprtrdma: Make rpcrdma_{un}map_one() into inline functions
xprtrdma: Handle non-SEND completions via a callout
xprtrdma: Add "open" memreg op
xprtrdma: Add "destroy MRs" memreg op
xprtrdma: Add "reset MRs" memreg op
...
Pull fourth vfs update from Al Viro:
"d_inode() annotations from David Howells (sat in for-next since before
the beginning of merge window) + four assorted fixes"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
RCU pathwalk breakage when running into a symlink overmounting something
fix I_DIO_WAKEUP definition
direct-io: only inc/dec inode->i_dio_count for file systems
fs/9p: fix readdir()
VFS: assorted d_backing_inode() annotations
VFS: fs/inode.c helpers: d_inode() annotations
VFS: fs/cachefiles: d_backing_inode() annotations
VFS: fs library helpers: d_inode() annotations
VFS: assorted weird filesystems: d_inode() annotations
VFS: normal filesystems (and lustre): d_inode() annotations
VFS: security/: d_inode() annotations
VFS: security/: d_backing_inode() annotations
VFS: net/: d_inode() annotations
VFS: net/unix: d_backing_inode() annotations
VFS: kernel/: d_inode() annotations
VFS: audit: d_backing_inode() annotations
VFS: Fix up some ->d_inode accesses in the chelsio driver
VFS: Cachefiles should perform fs modifications on the top layer only
VFS: AF_UNIX sockets should call mknod on the top layer only
Here's a set of updates to the Chrome OS platform drivers for this merge window.
Main new things this cycle are:
- Driver changes to expose the lightbar to users. With this, you can make your
own blinkenlights on Chromebook Pixels.
- Changes in the way that the atmel_mxt trackpads are probed. The laptop driver
is trying to be smart and not instantiate the devices that don't answer to
probe. For the trackpad that can come up in two modes (bootloader or regular),
this gets complicated since the driver already knows how to handle the two
modes including the actual addresses used. So now the laptop driver needs to
know more too, instantiating the regular address even if the bootloader one
is the probe that passed.
- mfd driver improvements by Javier Martines Canillas, and a few bugfixes
from him, kbuild and myself.
Merge tag 'chrome-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/olof/chrome-platform
Pull chrome platform updates from Olof Johansson:
"Here's a set of updates to the Chrome OS platform drivers for this
merge window.
Main new things this cycle are:
- Driver changes to expose the lightbar to users. With this, you can
make your own blinkenlights on Chromebook Pixels.
- Changes in the way that the atmel_mxt trackpads are probed. The
laptop driver is trying to be smart and not instantiate the devices
that don't answer to probe. For the trackpad that can come up in
two modes (bootloader or regular), this gets complicated since the
driver already knows how to handle the two modes including the
actual addresses used. So now the laptop driver needs to know more
too, instantiating the regular address even if the bootloader one
is the probe that passed.
- mfd driver improvements by Javier Martines Canillas, and a few
bugfixes from him, kbuild and myself"
* tag 'chrome-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/olof/chrome-platform:
platform/chrome: chromeos_laptop - instantiate Atmel at primary address
platform/chrome: cros_ec_lpc - Depend on X86 || COMPILE_TEST
platform/chrome: cros_ec_lpc - Include linux/io.h header file
platform/chrome: fix platform_no_drv_owner.cocci warnings
platform/chrome: cros_ec_lightbar - fix duplicate const warning
platform/chrome: cros_ec_dev - fix Unknown escape '%' warning
platform/chrome: Expose Chrome OS Lightbar to users
platform/chrome: Create sysfs attributes for the ChromeOS EC
mfd: cros_ec: Instantiate ChromeOS EC character device
platform/chrome: Add Chrome OS EC userspace device interface
platform/chrome: Add cros_ec_lpc driver for x86 devices
mfd: cros_ec: Add char dev and virtual dev pointers
mfd: cros_ec: Use fixed size arrays to transfer data with the EC
I_DIO_WAKEUP is never directly used, but fix it up anyway.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
do_blockdev_direct_IO() increments and decrements the inode
->i_dio_count for each IO operation. It does this to protect against
truncate of a file. Block devices don't need this sort of protection.
For a capable multiqueue setup, this atomic int is the only shared
state between applications accessing the device for O_DIRECT, and it
presents a scaling wall for that. In my testing, as much as 30% of
system time is spent incrementing and decrementing this value. A mixed
read/write workload improved from ~2.5M IOPS to ~9.6M IOPS, with
better latencies too. Before:
clat percentiles (usec):
| 1.00th=[ 33], 5.00th=[ 34], 10.00th=[ 34], 20.00th=[ 34],
| 30.00th=[ 34], 40.00th=[ 34], 50.00th=[ 35], 60.00th=[ 35],
| 70.00th=[ 35], 80.00th=[ 35], 90.00th=[ 37], 95.00th=[ 80],
| 99.00th=[ 98], 99.50th=[ 151], 99.90th=[ 155], 99.95th=[ 155],
| 99.99th=[ 165]
After:
clat percentiles (usec):
| 1.00th=[ 95], 5.00th=[ 108], 10.00th=[ 129], 20.00th=[ 149],
| 30.00th=[ 155], 40.00th=[ 161], 50.00th=[ 167], 60.00th=[ 171],
| 70.00th=[ 177], 80.00th=[ 185], 90.00th=[ 201], 95.00th=[ 270],
| 99.00th=[ 390], 99.50th=[ 398], 99.90th=[ 418], 99.95th=[ 422],
| 99.99th=[ 438]
In other setups, Robert Elliott reported seeing good performance
improvements:
https://lkml.org/lkml/2015/4/3/557
The more applications accessing the device, the worse it gets.
Add a new direct-io flag, DIO_SKIP_DIO_COUNT, which tells
do_blockdev_direct_IO() that it need not worry about incrementing
or decrementing the inode's i_dio_count for this caller.
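A hedged sketch of the guarded accounting inside do_blockdev_direct_IO()
(simplified; the real drop side also wakes i_dio_count waiters):

        if (!(dio->flags & DIO_SKIP_DIO_COUNT))
                atomic_inc(&inode->i_dio_count);        /* serialize vs. truncate */

        /* ... submit and complete the direct I/O ... */

        if (!(dio->flags & DIO_SKIP_DIO_COUNT))
                atomic_dec(&inode->i_dio_count);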
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Elliott, Robert (Server Storage) <elliott@hp.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jens Axboe <axboe@fb.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Might not have an outdev yet. We'll oops when iface goes down while skbs
are still nfqueue'd:
RIP: 0010:[<ffffffff81422a2f>] [<ffffffff81422a2f>] dev_cmp+0x4f/0x80
nfqnl_rcv_dev_event+0xe2/0x150
notifier_call_chain+0x53/0xa0
Fixes: c737b7c451 ("netfilter: bridge: add helpers for fetching physin/outdev")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
- correction of copy-paste stupidity while doing the cleanup
Merge tag 'dma-buf-for-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/sumits/dma-buf
Pull dma-buf updates from Sumit Semwal:
"Minor cleanup only; this could've gone in for the 4.0 merge window,
but for a copy-paste stupidity from me.
It has been in the for-next since then, and no issues reported.
- cleanup of dma_buf_export()
- correction of copy-paste stupidity while doing the cleanup"
* tag 'dma-buf-for-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/sumits/dma-buf:
staging: android: ion: fix wrong init of dma_buf_export_info
dma-buf: cleanup dma_buf_export() to make it easily extensible
Highlights:
- "experimental" code for managing md/raid1 across a cluster using
DLM. Code is not ready for general use and triggers a WARNING if used.
However it is looking good and mostly done, and having it in mainline
will help co-ordinate development.
- RAID5/6 can now batch multiple (4K wide) stripe_heads so as to
handle a full (chunk wide) stripe as a single unit.
- RAID6 can now perform read-modify-write cycles which should
help performance on larger arrays: 6 or more devices.
- RAID5/6 stripe cache now grows and shrinks dynamically. The value
set is used as a minimum.
- Resync is now allowed to go a little faster than the 'minimum' when
there is competing IO. How much faster depends on the speed of the
devices, so the effective minimum should scale with device speed to
some extent.
Merge tag 'md/4.1' of git://neil.brown.name/md
Pull md updates from Neil Brown:
"More updates that usual this time. A few have performance impacts
which hould mostly be positive, but RAID5 (in particular) can be very
work-load ensitive... We'll have to wait and see.
Highlights:
- "experimental" code for managing md/raid1 across a cluster using
DLM. Code is not ready for general use and triggers a WARNING if
used. However it is looking good and mostly done, and having it in
mainline will help co-ordinate development.
- RAID5/6 can now batch multiple (4K wide) stripe_heads so as to
handle a full (chunk wide) stripe as a single unit.
- RAID6 can now perform read-modify-write cycles which should help
performance on larger arrays: 6 or more devices.
- RAID5/6 stripe cache now grows and shrinks dynamically. The value
set is used as a minimum.
- Resync is now allowed to go a little faster than the 'minimum' when
there is competing IO. How much faster depends on the speed of the
devices, so the effective minimum should scale with device speed to
some extent"
* tag 'md/4.1' of git://neil.brown.name/md: (58 commits)
md/raid5: don't do chunk aligned read on degraded array.
md/raid5: allow the stripe_cache to grow and shrink.
md/raid5: change ->inactive_blocked to a bit-flag.
md/raid5: move max_nr_stripes management into grow_one_stripe and drop_one_stripe
md/raid5: pass gfp_t arg to grow_one_stripe()
md/raid5: introduce configuration option rmw_level
md/raid5: activate raid6 rmw feature
md/raid6 algorithms: xor_syndrome() for SSE2
md/raid6 algorithms: xor_syndrome() for generic int
md/raid6 algorithms: improve test program
md/raid6 algorithms: delta syndrome functions
raid5: handle expansion/resync case with stripe batching
raid5: handle io error of batch list
RAID5: batch adjacent full stripe write
raid5: track overwrite disk count
raid5: add a new flag to track if a stripe can be batched
raid5: use flex_array for scribble data
md raid0: access mddev->queue (request queue member) conditionally because it is not set when accessed from dm-raid
md: allow resync to go faster when there is competing IO.
md: remove 'go_faster' option from ->sync_request()
...
- DT endianness specification bindings
- Big endian 8250 serial support
- DT overlay unittest updates
- Various DT doc updates
- Compile fixes for OF_IRQ=n
Merge tag 'devicetree-for-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull second batch of devicetree updates from Rob Herring:
"As Grant mentioned in the first devicetree pull request, here is the
2nd batch of DT changes for 4.1. The main remaining item here is the
endianness bindings and related 8250 driver support.
- DT endianness specification bindings
- big-endian 8250 serial support
- DT overlay unittest updates
- various DT doc updates
- compile fixes for OF_IRQ=n"
* tag 'devicetree-for-4.1' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
frv: add io{read,write}{16,32}be functions
mn10300: add io{read,write}{16,32}be functions
Documentation: DT bindings: add doc for Altera's SoCFPGA platform
of: base: improve of_get_next_child() kernel-doc
Doc: dt: arch_timer: discourage clock-frequency use
of: unittest: overlay: Keep track of created overlays
of/fdt: fix allocation size for device node path
serial: of_serial: Support big-endian register accesses
serial: 8250: Add support for big-endian MMIO accesses
of: Document {little,big,native}-endian bindings
of/fdt: Add endianness helper function for early init code
of: Add helper function to check MMIO register endianness
of/fdt: Remove "reg" data prints from early_init_dt_scan_memory
of: add vendor prefix for Artesyn
of: Add dummy of_irq_to_resource_table() for IRQ_OF=n
of: OF_IRQ should depend on IRQ_DOMAIN