NFSv4.2 client, and cleaning up some convoluted backchannel server code
in the process. Otherwise, miscellaneous smaller bugfixes and cleanup.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJcLR3nAAoJECebzXlCjuG+oyAQALrPSTH9Qg2AwP2eGm+AevUj
u/VFmimImIO9dYuT02t4w42w4qMIQ0/Y7R0UjT3DxG5Oixy/zA+ZaNXCCEKwSMIX
abGF4YalUISbDc6n0Z8J14/T33wDGslhy3IQ9Jz5aBCDCocbWlzXvFlmrowbb3ak
vtB0Fc3Xo6Z/Pu2GzNzlqR+f69IAmwQGJrRrAEp3JUWSIBKiSWBXTujDuVBqJNYj
ySLzbzyAc7qJfI76K635XziULR2ueM3y5JbPX7kTZ0l3OJ6Yc0PtOj16sIv5o0XK
DBYPrtvw3ZbxQE/bXqtJV9Zn6MG5ODGxKszG1zT1J3dzotc9l/LgmcAY8xVSaO+H
QNMdU9QuwmyUG20A9rMoo/XfUb5KZBHzH7HIYOmkfBidcaygwIInIKoIDtzimm4X
OlYq3TL/3QDY6rgTCZv6n2KEnwiIDpc5+TvFhXRWclMOJMcMSHJfKFvqERSv9V3o
90qrCebPA0K8Dnc0HMxcBXZ+0TqZ2QeXp/wfIjibCXqMwlg+BZhmbeA0ngZ7x7qf
2F33E9bfVJjL+VI5FcVYQf43bOTWZgD6ZKGk4T7keYl0CPH+9P70bfhl4KKy9dqc
GwYooy/y5FPb2CvJn/EETeILRJ9OyIHUrw7HBkpz9N8n9z+V6Qbp9yW7LKgaMphW
1T+GpHZhQjwuBPuJhDK0
=dRLp
-----END PGP SIGNATURE-----
Merge tag 'nfsd-4.21' of git://linux-nfs.org/~bfields/linux
Pull nfsd updates from Bruce Fields:
"Thanks to Vasily Averin for fixing a use-after-free in the
containerized NFSv4.2 client, and cleaning up some convoluted
backchannel server code in the process.
Otherwise, miscellaneous smaller bugfixes and cleanup"
* tag 'nfsd-4.21' of git://linux-nfs.org/~bfields/linux: (25 commits)
nfs: fixed broken compilation in nfs_callback_up_net()
nfs: minor typo in nfs4_callback_up_net()
sunrpc: fix debug message in svc_create_xprt()
sunrpc: make visible processing error in bc_svc_process()
sunrpc: remove unused xpo_prep_reply_hdr callback
sunrpc: remove svc_rdma_bc_class
sunrpc: remove svc_tcp_bc_class
sunrpc: remove unused bc_up operation from rpc_xprt_ops
sunrpc: replace svc_serv->sv_bc_xprt by boolean flag
sunrpc: use-after-free in svc_process_common()
sunrpc: use SVC_NET() in svcauth_gss_* functions
nfsd: drop useless LIST_HEAD
lockd: Show pid of lockd for remote locks
NFSD remove OP_CACHEME from 4.2 op_flags
nfsd: Return EPERM, not EACCES, in some SETATTR cases
sunrpc: fix cache_head leak due to queued request
nfsd: clean up indentation, increase indentation in switch statement
svcrdma: Optimize the logic that selects the R_key to invalidate
nfsd: fix a warning in __cld_pipe_upcall()
nfsd4: fix crash on writing v4_end_grace before nfsd startup
...
If a reply has been processed but the RPC is later retransmitted
anyway, the req->rl_reply field still contains the only pointer to
the old rpcrdma rep. When the next reply comes in, the reply handler
will stomp on the rl_reply field, leaking the old rep.
A trace event is added to capture such leaks.
This problem seems to be worsened by the restructuring of the RPC
Call path in v4.20. Fully addressing this issue will require at
least a re-architecture of the disconnect logic, which is not
appropriate during -rc.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Clean up, no functional change is expected.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
These are rare, but can be helpful at tracking down DMAR and other
problems.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Name them "trace_xprtrdma_op_*" so they can be easily enabled as a
group. No trace point is added where the generic layer already has
observability.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
The chunk-related trace points capture nearly the same information
as the MR-related trace points.
Also, rename them so globbing can be used to enable or disable
these trace points more easily.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Clean up: Divide the work cleanly:
- rpcrdma_wc_receive is responsible only for RDMA Receives
- rpcrdma_reply_handler is responsible only for RPC Replies
- the posted send and receive counts both belong in rpcrdma_ep
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
This is mostly update of the usual drivers: smarpqi, lpfc, qedi,
megaraid_sas, libsas, zfcp, mpt3sas, hisi_sas. Additionally, we have
a pile of annotation, unused variable and minor updates. The big API
change is the updates for Christoph's DMA rework which include
removing the DISABLE_CLUSTERING flag. And finally there are a couple
of target tree updates.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXCEUNiYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishdjKAP9vrTTv
qFaYmAoRSbPq9ZiixaXLMy0K/6o76Uay0gnBqgD/fgn3jg/KQ6alNaCjmfeV3wAj
u1j3H7tha9j1it+4pUw=
=GDa+
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This is mostly update of the usual drivers: smarpqi, lpfc, qedi,
megaraid_sas, libsas, zfcp, mpt3sas, hisi_sas.
Additionally, we have a pile of annotation, unused variable and minor
updates.
The big API change is the updates for Christoph's DMA rework which
include removing the DISABLE_CLUSTERING flag.
And finally there are a couple of target tree updates"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (259 commits)
scsi: isci: request: mark expected switch fall-through
scsi: isci: remote_node_context: mark expected switch fall-throughs
scsi: isci: remote_device: Mark expected switch fall-throughs
scsi: isci: phy: Mark expected switch fall-through
scsi: iscsi: Capture iscsi debug messages using tracepoints
scsi: myrb: Mark expected switch fall-throughs
scsi: megaraid: fix out-of-bound array accesses
scsi: mpt3sas: mpt3sas_scsih: Mark expected switch fall-through
scsi: fcoe: remove set but not used variable 'port'
scsi: smartpqi: call pqi_free_interrupts() in pqi_shutdown()
scsi: smartpqi: fix build warnings
scsi: smartpqi: update driver version
scsi: smartpqi: add ofa support
scsi: smartpqi: increase fw status register read timeout
scsi: smartpqi: bump driver version
scsi: smartpqi: add smp_utils support
scsi: smartpqi: correct lun reset issues
scsi: smartpqi: correct volume status
scsi: smartpqi: do not offline disks for transient did no connect conditions
scsi: smartpqi: allow for larger raid maps
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAlwb7R8QHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpjiID/97oDjMhNT7rwpuMbHw855h62j1hEN/m+N3
FI0uxivYoYZLD+eJRnMcBwHlKjrCX8iJQAcv9ffI3ThtFW7dnZT3atUacaZVR/Dt
IrxdymdBP3qsmuaId5NYBug7rJ+AiqFJKjEvCcSPu5X397J4I3SEbzhfvYLJ/aZX
16o0HJlVVIrcbmq1IP4HwiIIOaKXvPaw04L4z4fpeynRSWG7EAi8NLSnhlR4Rxbb
BTiMkCTsjRCFdyO6da4fvNQKWmPGPa3bJkYy3qR99cvJCeIbQjRyCloQlWNJRRgi
3eJpCHVxqFmN0/+DNTJVQEEr4H8o0AVucrLVct1Jc4pessenkpoUniP8vELqwlng
Z2VHLkhTfCEmvFlk82grrYdNvGATRsrbswt/PlP4T7rBfr1IpDk8kXDWF59EL2dy
ly35Sk3wJGHBl8qa+vEPXOAnaWdqJXuVGpwB4ifOIatOls8mOxwfZjiRc7x05/fC
1O4rR2IfLwRqwoYHs0AJ+h6ohOSn1mkGezl2Tch1VSFcJUOHmuYvraTaUi6hblpA
SslaAoEhO39hRBL0HsvsMeqVWM9uzqvFkLDCfNPdiA81H1258CIbo4vF8z6czCIS
eeXnTJxVhPVbZgb3a1a93SPwM6KIDZFoIijyd+NqjpU94thlnhYD0QEcKJIKH7os
2p4aHs6ktw==
=TRdW
-----END PGP SIGNATURE-----
Merge tag 'for-4.21/block-20181221' of git://git.kernel.dk/linux-block
Pull block updates from Jens Axboe:
"This is the main pull request for block/storage for 4.21.
Larger than usual, it was a busy round with lots of goodies queued up.
Most notable is the removal of the old IO stack, which has been a long
time coming. No new features for a while, everything coming in this
week has all been fixes for things that were previously merged.
This contains:
- Use atomic counters instead of semaphores for mtip32xx (Arnd)
- Cleanup of the mtip32xx request setup (Christoph)
- Fix for circular locking dependency in loop (Jan, Tetsuo)
- bcache (Coly, Guoju, Shenghui)
* Optimizations for writeback caching
* Various fixes and improvements
- nvme (Chaitanya, Christoph, Sagi, Jay, me, Keith)
* host and target support for NVMe over TCP
* Error log page support
* Support for separate read/write/poll queues
* Much improved polling
* discard OOM fallback
* Tracepoint improvements
- lightnvm (Hans, Hua, Igor, Matias, Javier)
* Igor added packed metadata to pblk. Now drives without metadata
per LBA can be used as well.
* Fix from Geert on uninitialized value on chunk metadata reads.
* Fixes from Hans and Javier to pblk recovery and write path.
* Fix from Hua Su to fix a race condition in the pblk recovery
code.
* Scan optimization added to pblk recovery from Zhoujie.
* Small geometry cleanup from me.
- Conversion of the last few drivers that used the legacy path to
blk-mq (me)
- Removal of legacy IO path in SCSI (me, Christoph)
- Removal of legacy IO stack and schedulers (me)
- Support for much better polling, now without interrupts at all.
blk-mq adds support for multiple queue maps, which enables us to
have a map per type. This in turn enables nvme to have separate
completion queues for polling, which can then be interrupt-less.
Also means we're ready for async polled IO, which is hopefully
coming in the next release.
- Killing of (now) unused block exports (Christoph)
- Unification of the blk-rq-qos and blk-wbt wait handling (Josef)
- Support for zoned testing with null_blk (Masato)
- sx8 conversion to per-host tag sets (Christoph)
- IO priority improvements (Damien)
- mq-deadline zoned fix (Damien)
- Ref count blkcg series (Dennis)
- Lots of blk-mq improvements and speedups (me)
- sbitmap scalability improvements (me)
- Make core inflight IO accounting per-cpu (Mikulas)
- Export timeout setting in sysfs (Weiping)
- Cleanup the direct issue path (Jianchao)
- Export blk-wbt internals in block debugfs for easier debugging
(Ming)
- Lots of other fixes and improvements"
* tag 'for-4.21/block-20181221' of git://git.kernel.dk/linux-block: (364 commits)
kyber: use sbitmap add_wait_queue/list_del wait helpers
sbitmap: add helpers for add/del wait queue handling
block: save irq state in blkg_lookup_create()
dm: don't reuse bio for flushes
nvme-pci: trace SQ status on completions
nvme-rdma: implement polling queue map
nvme-fabrics: allow user to pass in nr_poll_queues
nvme-fabrics: allow nvmf_connect_io_queue to poll
nvme-core: optionally poll sync commands
block: make request_to_qc_t public
nvme-tcp: fix spelling mistake "attepmpt" -> "attempt"
nvme-tcp: fix endianess annotations
nvmet-tcp: fix endianess annotations
nvme-pci: refactor nvme_poll_irqdisable to make sparse happy
nvme-pci: only set nr_maps to 2 if poll queues are supported
nvmet: use a macro for default error location
nvmet: fix comparison of a u16 with -1
blk-mq: enable IO poll if .nr_queues of type poll > 0
blk-mq: change blk_mq_queue_busy() to blk_mq_queue_inflight()
blk-mq: skip zero-queue maps in blk_mq_map_swqueue
...
if node have NFSv41+ mounts inside several net namespaces
it can lead to use-after-free in svc_process_common()
svc_process_common()
/* Setup reply header */
rqstp->rq_xprt->xpt_ops->xpo_prep_reply_hdr(rqstp); <<< HERE
svc_process_common() can use incorrect rqstp->rq_xprt,
its caller function bc_svc_process() takes it from serv->sv_bc_xprt.
The problem is that serv is global structure but sv_bc_xprt
is assigned per-netnamespace.
According to Trond, the whole "let's set up rqstp->rq_xprt
for the back channel" is nothing but a giant hack in order
to work around the fact that svc_process_common() uses it
to find the xpt_ops, and perform a couple of (meaningless
for the back channel) tests of xpt_flags.
All we really need in svc_process_common() is to be able to run
rqstp->rq_xprt->xpt_ops->xpo_prep_reply_hdr()
Bruce J Fields points that this xpo_prep_reply_hdr() call
is an awfully roundabout way just to do "svc_putnl(resv, 0);"
in the tcp case.
This patch does not initialiuze rqstp->rq_xprt in bc_svc_process(),
now it calls svc_process_common() with rqstp->rq_xprt = NULL.
To adjust reply header svc_process_common() just check
rqstp->rq_prot and calls svc_tcp_prep_reply_hdr() for tcp case.
To handle rqstp->rq_xprt = NULL case in functions called from
svc_process_common() patch intruduces net namespace pointer
svc_rqst->rq_bc_net and adjust SVC_NET() definition.
Some other function was also adopted to properly handle described case.
Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
Cc: stable@vger.kernel.org
Fixes: 23c20ecd44 ("NFS: callback up - users counting cleanup")
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJcILryAAoJEAAOaEEZVoIVByMP/RYty0dsV9ALt0PKqxKyDVT7
9KOGA73JIahpeO2dnlIiZAXdeC3Y350UH2axrX6Xy/X6ktTCca3X/xqZMn/wK5kt
zeMeqcjsyTz4wJoOCWUm7nsteMALfusDAIMd8axEnBFgWk9nsQwgh+jS1gYVj2D1
GUSdCceKcQ0ZOSsZDzDFgd/R34dNMobvYd1aOE2bgHL19BAXj1aKIv81yAjjY41A
D+VVvEyzIHQUtxmWk1X3X8kYfhHMu4X5AQhhqPxw8jw8CC0w0lfQTOZe/ApeMfxZ
BFhusMdwf8QhkGWMOfhOTldTm3GobBmdsNX5HmukvcAEjDVPiOLCiKXaLWEBJ68r
HbmB3YyxzT2re0PfSa72WIu6W8aKZHUny+BLgTHiuW3KIV1khJK4AflWKMLfe8yi
0xUWm0SeiwMCdBLtkD8IykC19LBCAKM15JBxpUHadBvBsO9G+54DfzRImjnxSjAH
tX0RnFWmA7hXt02fZjcywT+/+n84kt0lbjdJAKXK6tl0DbGxlEj8YoJVBNG/p94q
ARHgBKMFTrsDPJiMY5WSobHapjmnSNDBk6uSf6TPe2E5wmnhLcwfTqwCdnpocW6c
A7CX+q/Jfyk1oNEGajXXyIsjaDc2Ma0Fs/mfb3o5Lhkjq68e+LT0+eIjn4veuY5H
KoZ/KVBGs4Uxy81qMPkl
=sUSp
-----END PGP SIGNATURE-----
Merge tag 'locks-v4.21-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux
Pull file locking updates from Jeff Layton:
"The main change in this set is Neil Brown's work to reduce the
thundering herd problem when a heavily-contended file lock is
released.
Previously we'd always wake up all waiters when this occurred. With
this set, we'll now we only wake up waiters that were blocked on the
range being released"
* tag 'locks-v4.21-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
locks: Use inode_is_open_for_write
fs/locks: remove unnecessary white space.
fs/locks: merge posix_unblock_lock() and locks_delete_block()
fs/locks: create a tree of dependent requests.
fs/locks: change all *_conflict() functions to return bool.
fs/locks: always delete_block after waiting.
fs/locks: allow a lock request to block other requests.
fs/locks: use properly initialized file_lock when unlocking.
ocfs2: properly initial file_lock used for unlock.
gfs2: properly initial file_lock used for unlock.
NFS: use locks_copy_lock() to copy locks.
fs/locks: split out __locks_wake_up_blocks().
fs/locks: rename some lists and pointers.
in ext4's NFS support, and fix an ioctl (EXT4_IOC_GROUP_ADD) used by
old versions of e2fsprogs which we accidentally broke a while back.
Also fixed some error paths in ext4's quota and inline data support.
Finally, improve tail latency in jbd2's commit code.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAlwgYYEACgkQ8vlZVpUN
gaPh1Af9GCGcgbwmmE+PSwqYXXdDm27hG3Xv4glLcAnmmuT32lMjzjifWPhT8sFs
+l1nh2rd0/u14PocjAL8neuuc9G9J+xNS33jlNQsqMsfFQD4cZMk7T1j68JIAEd3
bt/VNIUxvPshYwgvEJlXAeZvXx8kPMKyR44/FyzHdU9oDSWBYE3A9+rjRGUFxXDR
LuqwLhvERv6Vykfrzhluj8IOZM6V221alRDuWjx1sQF+/E6zAqyjR3YoYXk04Ajg
vnAfEXToeBwLVeTUQgmT9hPrinh7/00wCKekNuzzhg7oKDp7FgD1BMlxBr9eOW+5
pQwM9T+AVbs9EfpYasC6ElEMbLfOPw==
=lCnm
-----END PGP SIGNATURE-----
Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
"All cleanups and bug fixes; most notably, fix some problems discovered
in ext4's NFS support, and fix an ioctl (EXT4_IOC_GROUP_ADD) used by
old versions of e2fsprogs which we accidentally broke a while back.
Also fixed some error paths in ext4's quota and inline data support.
Finally, improve tail latency in jbd2's commit code"
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
ext4: check for shutdown and r/o file system in ext4_write_inode()
ext4: force inode writes when nfsd calls commit_metadata()
ext4: avoid declaring fs inconsistent due to invalid file handles
ext4: include terminating u32 in size of xattr entries when expanding inodes
ext4: compare old and new mode before setting update_mode flag
ext4: fix EXT4_IOC_GROUP_ADD ioctl
ext4: hard fail dax mount on unsupported devices
jbd2: update locking documentation for transaction_t
ext4: remove redundant condition check
jbd2: clean up indentation issue, replace spaces with tab
ext4: clean up indentation issues, remove extraneous tabs
ext4: missing unlock/put_page() in ext4_try_to_write_inline_data()
ext4: fix possible use after free in ext4_quota_enable
jbd2: avoid long hold times of j_state_lock while committing a transaction
ext4: add ext4_sb_bread() to disambiguate ENOMEM cases
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAlwaO6gACgkQxWXV+ddt
WDuiuRAAktVw4AEogVKW8J9Fx7UCW8hJ0C2NFWSlcvaNfKi7ccMBsINqQIH6bvmv
lO2KUuaKRK+FID3R3Er9Z4GZmvg7nXAEp/+PQgYMYYOFOmsdr/YCZCjBn9ACDZ/K
XJD/ESpEq9FORMj1yHre3zKgCPwkOK66Z8XFRMPaC5ANiLNohuE25FkwTTva0Bse
TyOc8aSKmfXBqoNvlKS0qc52FUoS6aebrYfTA7l9FfmUJ0szZeHSCtRyY4EzlQu7
VLojNq7GkSXpGGQHTTSp73Bj1E2ZM6fwTgpC1NHAFnTCTFSr/GswdfMdmxMkNuqm
zsrioqIzh6kX8RXVvB81Vb1aCRXELhcpBDWK/5QbEPyQvfKxQJLoHBt3QqewwtnT
lClhNi4tx3ItfWddz24kq/E8aaIOQtECq6OQXpyRPc47TmKwY8m8V41LP/ECNBQ2
InIuAsk0abkQvemc98SNxDNV0hiMJHJnR93wY8MEs6JTY3PuVE6ELrMTB7GRsl6M
zHHFD1kSBVNMxWr+pwnDe5NxQ3qP4SB1RZvTlN7FdzLAhWIFEN7foHoXI9h2G0kx
+H/cVr2OiXdxUd19IuAF9kxVkfrJ7A3JEopVdaI57swyu/IK2xGsPd5nDkf4kOXU
OpcnnbDxg6v0NL/Vr/LxWSa6qgVqM2D4JyxjMTMPQOhK7Y6GP4M=
=rO1q
-----END PGP SIGNATURE-----
Merge tag 'for-4.21-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
"New features:
- swapfile support - after a long time it's here, with some
limitations where COW design does not work well with the swap
implementation (nodatacow file, no compression, cannot be
snapshotted, not possible on multiple devices, ...), as this is the
most restricted but working setup, we'll try to improve that in the
future
- metadata uuid - an optional incompat feature to assign a new
filesystem UUID without overwriting all metadata blocks, stored
only in superblock
- more balance messages are printed to system log, initial is in the
format of the command line that would be used to start it
Fixes:
- tag pages of a snapshot to better separate pages that are involved
in the snapshot (and need to get synced) from newly dirtied pages
that could slow down or even livelock the snapshot operation
- improved check of filesystem id associated with a device during
scan to detect duplicate devices that could be mixed up during
mount
- fix device replace state transitions, eg. when it ends up
interrupted and reboot tries to restart balance too, or when
start/cancel ioctls race
- fix a crash due to a race when quotas are enabled during snapshot
creation
- GFP_NOFS/memalloc_nofs_* fixes due to GFP_KERNEL allocations in
transaction context
- fix fsync of files with multiple hard links in new directories
- fix race of send with transaction commits that create snapshots
Core changes:
- cleanups:
* further removals of now-dead fsync code
* core function for finding free extent has been split and
provides a base for further cleanups to make the logic more
understandable
* removed lot of indirect callbacks for data and metadata inodes
* simplified refcounting and locking for cloned extent buffers
* removed redundant function arguments
* defines converted to enums where appropriate
- separate reserve for delayed refs from global reserve, update logic
to do less trickery and ad-hoc heuristics, move out some related
expensive operations from transaction commit or file truncate
- dev-replace switched from custom locking scheme to semaphore
- remove first phase of balance that tried to make some space for the
relocation by calling shrink and grow, this did not work as
expected and only introduced more error states due to potential
resize failures, slightly improves the runtime as the chunks on all
devices are not needlessly enumerated
- clone and deduplication now use generic helper that adds a few more
checks that were missing from the original btrfs implementation of
the ioctls"
* tag 'for-4.21-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (125 commits)
btrfs: Fix typos in comments and strings
btrfs: improve error handling of btrfs_add_link
Btrfs: use generic_remap_file_range_prep() for cloning and deduplication
btrfs: Refactor main loop in extent_readpages
btrfs: Remove 1st shrink/grow phase from balance
Btrfs: send, fix race with transaction commits that create snapshots
Btrfs: use nofs context when initializing security xattrs to avoid deadlock
btrfs: run delayed items before dropping the snapshot
btrfs: catch cow on deleting snapshots
btrfs: extent-tree: cleanup one-shot usage of @blocksize in do_walk_down
Btrfs: scrub, move setup of nofs contexts higher in the stack
btrfs: scrub: move scrub_setup_ctx allocation out of device_list_mutex
btrfs: scrub: pass fs_info to scrub_setup_ctx
btrfs: fix truncate throttling
btrfs: don't run delayed refs in the end transaction logic
btrfs: rework btrfs_check_space_for_delayed_refs
btrfs: add new flushing states for the delayed refs rsv
btrfs: update may_commit_transaction to use the delayed refs rsv
btrfs: introduce delayed_refs_rsv
btrfs: only track ref_heads in delayed_ref_updates
...
This commit enhances iscsi initiator modules to capture iscsi debug
messages using linux kernel tracepoint facility:
https://www.kernel.org/doc/Documentation/trace/tracepoints.txt
The following tracepoint events have been created under the iscsi
tracepoint event group:
iscsi_dbg_conn - to capture connection debug messages (libiscsi module)
iscsi_dbg_session - to capture session debug messages (libiscsi module)
iscsi_dbg_eh - to capture error handling debug messages (libiscsi module)
iscsi_dbg_tcp - to capture iscsi tcp debug messages (libiscsi_tcp module)
iscsi_dbg_sw_tcp - to capture iscsi sw tcp debug messages (iscsi_tcp module)
iscsi_dbg_trans_session - to cpature iscsi transsport sess debug messages
(scsi_transport_iscsi module)
iscsi_dbg_trans_conn - to capture iscsi transport conn debug messages
(scsi_transport_iscsi module)
[mkp: typos]
Signed-off-by: Fred Herard <fred.herard@oracle.com>
Reviewed-by: Rajan Shanmugavelu <rajan.shanmugavelu@oracle.com>
Reviewed-by: Lee Duncan <lduncan@suse.com>
Reviewed-by: Douglas Gilbert <dgilbert@interlog.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Some time back, nfsd switched from calling vfs_fsync() to using a new
commit_metadata() hook in export_operations(). If the file system did
not provide a commit_metadata() hook, it fell back to using
sync_inode_metadata(). Unfortunately doesn't work on all file
systems. In particular, it doesn't work on ext4 due to how the inode
gets journalled --- the VFS writeback code will not always call
ext4_write_inode().
So we need to provide our own ext4_nfs_commit_metdata() method which
calls ext4_write_inode() directly.
Google-Bug-Id: 121195940
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
A nice thing we gain with the delayed refs rsv is the ability to flush
the delayed refs on demand to deal with enospc pressure. Add states to
flush delayed refs on demand, and this will allow us to remove a lot of
ad-hoc work around checking to see if we should commit the transaction
to run our delayed refs.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Currently btrfs_fs_info structure contains a copy of the
fsid/metadata_uuid fields. Same values are also contained in the
btrfs_fs_devices structure which fs_info has a reference to. Let's
reduce duplication by removing the fields from fs_info and always refer
to the ones in fs_devices. No functional changes.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Sometimes flush journal may be very frequent, so it's useful to dump
number of keys every time write journal.
Signed-off-by: Guoju Fang <fangguoju@gmail.com>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Several conflicts, seemingly all over the place.
I used Stephen Rothwell's sample resolutions for many of these, if not
just to double check my own work, so definitely the credit largely
goes to him.
The NFP conflict consisted of a bug fix (moving operations
past the rhashtable operation) while chaning the initial
argument in the function call in the moved code.
The net/dsa/master.c conflict had to do with a bug fix intermixing of
making dsa_master_set_mtu() static with the fixing of the tagging
attribute location.
cls_flower had a conflict because the dup reject fix from Or
overlapped with the addition of port range classifiction.
__set_phy_supported()'s conflict was relatively easy to resolve
because Andrew fixed it in both trees, so it was just a matter
of taking the net-next copy. Or at least I think it was :-)
Joe Stringer's fix to the handling of netns id 0 in bpf_sk_lookup()
intermixed with changes on how the sdif and caller_net are calculated
in these code paths in net-next.
The remaining BPF conflicts were largely about the addition of the
__bpf_md_ptr stuff in 'net' overlapping with adjustments and additions
to the relevant data structure where the MD pointer macros are used.
Signed-off-by: David S. Miller <davem@davemloft.net>
Trace events are already present for the receive entry points, to indicate
how the reception entered the stack.
This patch adds the corresponding exit trace events that will bound the
reception such that all events occurring between the entry and the exit
can be considered as part of the reception context. This greatly helps
for dependency and root cause analyses.
Without this, it is not possible with tracepoint instrumentation to
determine whether a sched_wakeup event following a netif_receive_skb
event is the result of the packet reception or a simple coincidence after
further processing by the thread. It is possible using other mechanisms
like kretprobes, but considering the "entry" points are already present,
it would be good to add the matching exit events.
In addition to linking packets with wakeups, the entry/exit event pair
can also be used to perform network stack latency analyses.
Signed-off-by: Geneviève Bastien <gbastien@versatic.net>
CC: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Steven Rostedt <rostedt@goodmis.org>
CC: Ingo Molnar <mingo@redhat.com>
CC: David S. Miller <davem@davemloft.net>
Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> (tracing side)
Signed-off-by: David S. Miller <davem@davemloft.net>
was introduced by a patch that tried to fix one bug, but by doing so created
another bug. As both bugs corrupt the output (but they do not crash the
kernel), I decided to fix the design such that it could have both bugs
fixed. The original fix, fixed time reporting of the function graph tracer
when doing a max_depth of one. This was code that can test how much the
kernel interferes with userspace. But in doing so, it could corrupt the time
keeping of the function profiler.
The issue is that the curr_ret_stack variable was being used for two
different meanings. One was to keep track of the stack pointer on the
ret_stack (shadow stack used by the function graph tracer), and the other
use case was the graph call depth. Although, the two may be closely
related, where they got updated was the issue that lead to the two different
bugs that required the two use cases to be updated differently.
The big issue with this fix is that it requires changing each architecture.
The good news is, I was able to remove a lot of code that was duplicated
within the architectures and place it into a single location. Then I could
make the fix in one place.
I pushed this code into linux-next to let it settle over a week, and before
doing so, I cross compiled all the affected architectures to make sure that
they built fine.
In the mean time, I also pulled in a patch that fixes the sched_switch
previous tasks state output, that was not actually correct.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCW/4NPhQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qnWAAQCyUIRLgYImr81eTl52lxNRsULk+aiI
U29kRFWWU0c40AEA1X9sDF0MgOItbRGfZtnHTZEousXRDaDf4Fge2kF7Egg=
=liQ0
-----END PGP SIGNATURE-----
Merge tag 'trace-v4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"While rewriting the function graph tracer, I discovered a design flaw
that was introduced by a patch that tried to fix one bug, but by doing
so created another bug.
As both bugs corrupt the output (but they do not crash the kernel), I
decided to fix the design such that it could have both bugs fixed. The
original fix, fixed time reporting of the function graph tracer when
doing a max_depth of one. This was code that can test how much the
kernel interferes with userspace. But in doing so, it could corrupt
the time keeping of the function profiler.
The issue is that the curr_ret_stack variable was being used for two
different meanings. One was to keep track of the stack pointer on the
ret_stack (shadow stack used by the function graph tracer), and the
other use case was the graph call depth. Although, the two may be
closely related, where they got updated was the issue that lead to the
two different bugs that required the two use cases to be updated
differently.
The big issue with this fix is that it requires changing each
architecture. The good news is, I was able to remove a lot of code
that was duplicated within the architectures and place it into a
single location. Then I could make the fix in one place.
I pushed this code into linux-next to let it settle over a week, and
before doing so, I cross compiled all the affected architectures to
make sure that they built fine.
In the mean time, I also pulled in a patch that fixes the sched_switch
previous tasks state output, that was not actually correct"
* tag 'trace-v4.20-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
sched, trace: Fix prev_state output in sched_switch tracepoint
function_graph: Have profiler use curr_ret_stack and not depth
function_graph: Reverse the order of pushing the ret_stack and the callback
function_graph: Move return callback before update of curr_ret_stack
function_graph: Use new curr_ret_depth to manage depth instead of curr_ret_stack
function_graph: Make ftrace_push_return_trace() static
sparc/function_graph: Simplify with function_graph_enter()
sh/function_graph: Simplify with function_graph_enter()
s390/function_graph: Simplify with function_graph_enter()
riscv/function_graph: Simplify with function_graph_enter()
powerpc/function_graph: Simplify with function_graph_enter()
parisc: function_graph: Simplify with function_graph_enter()
nds32: function_graph: Simplify with function_graph_enter()
MIPS: function_graph: Simplify with function_graph_enter()
microblaze: function_graph: Simplify with function_graph_enter()
arm64: function_graph: Simplify with function_graph_enter()
ARM: function_graph: Simplify with function_graph_enter()
x86/function_graph: Simplify with function_graph_enter()
function_graph: Create function_graph_enter() to consolidate architecture code
struct file lock contains an 'fl_next' pointer which
is used to point to the lock that this request is blocked
waiting for. So rename it to fl_blocker.
The fl_blocked list_head in an active lock is the head of a list of
blocked requests. In a request it is a node in that list.
These are two distinct uses, so replace with two list_heads
with different names.
fl_blocked_requests is the head of a list of blocked requests
fl_blocked_member is a node in a member of that list.
The two different list_heads are never used at the same time, but that
will change in a future patch.
Note that a tracepoint is changed to report fl_blocker instead
of fl_next.
Signed-off-by: NeilBrown <neilb@suse.com>
Reviewed-by: J. Bruce Fields <bfields@redhat.com>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
commit 3f5fe9fef5 ("sched/debug: Fix task state recording/printout")
tried to fix the problem introduced by a previous commit efb40f588b
("sched/tracing: Fix trace_sched_switch task-state printing"). However
the prev_state output in sched_switch is still broken.
task_state_index() uses fls() which considers the LSB as 1. Left
shifting 1 by this value gives an incorrect mapping to the task state.
Fix this by decrementing the value returned by __get_task_state()
before shifting.
Link: http://lkml.kernel.org/r/1540882473-1103-1-git-send-email-pkondeti@codeaurora.org
Cc: stable@vger.kernel.org
Fixes: 3f5fe9fef5 ("sched/debug: Fix task state recording/printout")
Signed-off-by: Pavankumar Kondeti <pkondeti@codeaurora.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Pull networking fixes from David Miller:
1) Fix some potentially uninitialized variables and use-after-free in
kvaser_usb can drier, from Jimmy Assarsson.
2) Fix leaks in qed driver, from Denis Bolotin.
3) Socket leak in l2tp, from Xin Long.
4) RSS context allocation fix in bnxt_en from Michael Chan.
5) Fix cxgb4 build errors, from Ganesh Goudar.
6) Route leaks in ipv6 when removing exceptions, from Xin Long.
7) Memory leak in IDR allocation handling of act_pedit, from Davide
Caratti.
8) Use-after-free of bridge vlan stats, from Nikolay Aleksandrov.
9) When MTU is locked, do not force DF bit on ipv4 tunnels. From
Sabrina Dubroca.
10) When NAPI cached skb is reused, we must set it to the proper initial
state which includes skb->pkt_type. From Eric Dumazet.
11) Lockdep and non-linear SKB handling fix in tipc from Jon Maloy.
12) Set RX queue properly in various tuntap receive paths, from Matthew
Cover.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits)
tuntap: fix multiqueue rx
ipv6: Fix PMTU updates for UDP/raw sockets in presence of VRF
tipc: don't assume linear buffer when reading ancillary data
tipc: fix lockdep warning when reinitilaizing sockets
net-gro: reset skb->pkt_type in napi_reuse_skb()
tc-testing: tdc.py: Guard against lack of returncode in executed command
tc-testing: tdc.py: ignore errors when decoding stdout/stderr
ip_tunnel: don't force DF when MTU is locked
MAINTAINERS: Add entry for CAKE qdisc
net: bridge: fix vlan stats use-after-free on destruction
socket: do a generic_file_splice_read when proto_ops has no splice_read
net: phy: mdio-gpio: Fix working over slow can_sleep GPIOs
Revert "net: phy: mdio-gpio: Fix working over slow can_sleep GPIOs"
net: phy: mdio-gpio: Fix working over slow can_sleep GPIOs
net/sched: act_pedit: fix memory leak when IDR allocation fails
net: lantiq: Fix returned value in case of error in 'xrx200_probe()'
ipv6: fix a dst leak when removing its exception
net: mvneta: Don't advertise 2.5G modes
drivers/net/ethernet/qlogic/qed/qed_rdma.h: fix typo
net/mlx4: Fix UBSAN warning of signed integer overflow
...
This lib tracks objects which could be of two types:
1) root object
2) nested object - with a "delta" which differentiates it from
the associated root object
The objects are tracked by a hashtable and reference-counted. User is
responsible of implementing callbacks to create/destroy root entity
related to each root object and callback to create/destroy nested object
delta.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The life-checking function, which is used by kAFS to make sure that a call
is still live in the event of a pending signal, only samples the received
packet serial number counter; it doesn't actually provoke a change in the
counter, rather relying on the server to happen to give us a packet in the
time window.
Fix this by adding a function to force a ping to be transmitted.
kAFS then keeps track of whether there's been a stall, and if so, uses the
new function to ping the server, resetting the timeout to allow the reply
to come back.
If there's a stall, a ping and the call is *still* stalled in the same
place after another period, then the call will be aborted.
Fixes: bc5e3a546d ("rxrpc: Use MSG_WAITALL to tell sendmsg() to temporarily ignore signals")
Fixes: f4d15fb6f9 ("rxrpc: Provide functions for allowing cleaner handling of signals")
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When copying to the latency type, we should be passing LATENCY_TYPE_LEN,
not DOMAIN_LEN (this isn't a problem in practice because we only pass
"total" or "I/O"). Fix it by changing all of the strlcpy() calls to use
sizeof().
Fixes: 6c3b7af1c9 ("kyber: add tracepoints")
Reported-by: Jordan Glover <Golden_Miller83@protonmail.ch>
Tested-by: Jordan Glover <Golden_Miller83@protonmail.ch>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pull AFS updates from Al Viro:
"AFS series, with some iov_iter bits included"
* 'work.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (26 commits)
missing bits of "iov_iter: Separate type from direction and use accessor functions"
afs: Probe multiple fileservers simultaneously
afs: Fix callback handling
afs: Eliminate the address pointer from the address list cursor
afs: Allow dumping of server cursor on operation failure
afs: Implement YFS support in the fs client
afs: Expand data structure fields to support YFS
afs: Get the target vnode in afs_rmdir() and get a callback on it
afs: Calc callback expiry in op reply delivery
afs: Fix FS.FetchStatus delivery from updating wrong vnode
afs: Implement the YFS cache manager service
afs: Remove callback details from afs_callback_break struct
afs: Commit the status on a new file/dir/symlink
afs: Increase to 64-bit volume ID and 96-bit vnode ID for YFS
afs: Don't invoke the server to read data beyond EOF
afs: Add a couple of tracepoints to log I/O errors
afs: Handle EIO from delivery function
afs: Fix TTL on VL server and address lists
afs: Implement VL server rotation
afs: Improve FS server rotation error handling
...
Merge updates from Andrew Morton:
- a few misc things
- ocfs2 updates
- most of MM
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (132 commits)
hugetlbfs: dirty pages as they are added to pagecache
mm: export add_swap_extent()
mm: split SWP_FILE into SWP_ACTIVATED and SWP_FS
tools/testing/selftests/vm/map_fixed_noreplace.c: add test for MAP_FIXED_NOREPLACE
mm: thp: relocate flush_cache_range() in migrate_misplaced_transhuge_page()
mm: thp: fix mmu_notifier in migrate_misplaced_transhuge_page()
mm: thp: fix MADV_DONTNEED vs migrate_misplaced_transhuge_page race condition
mm/kasan/quarantine.c: make quarantine_lock a raw_spinlock_t
mm/gup: cache dev_pagemap while pinning pages
Revert "x86/e820: put !E820_TYPE_RAM regions into memblock.reserved"
mm: return zero_resv_unavail optimization
mm: zero remaining unavailable struct pages
tools/testing/selftests/vm/gup_benchmark.c: add MAP_HUGETLB option
tools/testing/selftests/vm/gup_benchmark.c: add MAP_SHARED option
tools/testing/selftests/vm/gup_benchmark.c: allow user specified file
tools/testing/selftests/vm/gup_benchmark.c: fix 'write' flag usage
mm/gup_benchmark.c: add additional pinning methods
mm/gup_benchmark.c: time put_page()
mm: don't raise MEMCG_OOM event due to failed high-order allocation
mm/page-writeback.c: fix range_cyclic writeback vs writepages deadlock
...
Refaults happen during transitions between workingsets as well as in-place
thrashing. Knowing the difference between the two has a range of
applications, including measuring the impact of memory shortage on the
system performance, as well as the ability to smarter balance pressure
between the filesystem cache and the swap-backed workingset.
During workingset transitions, inactive cache refaults and pushes out
established active cache. When that active cache isn't stale, however,
and also ends up refaulting, that's bonafide thrashing.
Introduce a new page flag that tells on eviction whether the page has been
active or not in its lifetime. This bit is then stored in the shadow
entry, to classify refaults as transitioning or thrashing.
How many page->flags does this leave us with on 32-bit?
20 bits are always page flags
21 if you have an MMU
23 with the zone bits for DMA, Normal, HighMem, Movable
29 with the sparsemem section bits
30 if PAE is enabled
31 with this patch.
So on 32-bit PAE, that leaves 1 bit for distinguishing two NUMA nodes. If
that's not enough, the system can switch to discontigmem and re-gain the 6
or 7 sparsemem section bits.
Link: http://lkml.kernel.org/r/20180828172258.3185-3-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Daniel Drake <drake@endlessm.com>
Tested-by: Suren Baghdasaryan <surenb@google.com>
Cc: Christopher Lameter <cl@linux.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Weiner <jweiner@fb.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Peter Enderborg <peter.enderborg@sony.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Highlights include:
Stable fixes:
- Fix the NFSv4.1 r/wsize sanity checking
- Reset the RPC/RDMA credit grant properly after a disconnect
- Fix a missed page unlock after pg_doio()
Features and optimisations:
- Overhaul of the RPC client socket code to eliminate a locking bottleneck
and reduce the latency when transmitting lots of requests in parallel.
- Allow parallelisation of the RPCSEC_GSS encoding of an RPC request.
- Convert the RPC client socket receive code to use iovec_iter() for
improved efficiency.
- Convert several NFS and RPC lookup operations to use RCU instead of
taking global locks.
- Avoid the need for BH-safe locks in the RPC/RDMA back channel.
Bugfixes and cleanups:
- Fix lock recovery during NFSv4 delegation recalls
- Fix the NFSv4 + NFSv4.1 "lookup revalidate + open file" case.
- Fixes for the RPC connection metrics
- Various RPC client layer cleanups to consolidate stream based sockets
- RPC/RDMA connection cleanups
- Simplify the RPC/RDMA cleanup after memory operation failures
- Clean ups for NFS v4.2 copy completion and NFSv4 open state reclaim.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJb0zW8AAoJEA4mA3inWBJcmccP/0hkeNFk2y4tErit1lq4TYDs
sMkFv0rjhBkxWbZFmGJfAulbQ5cu+GwTBqqmhm67rE+2C+vevrE4JRfDFmcEGpio
lE/2uJdqu1UlIOiovyjk0jMetUuf2LTS82vloPP/z5mmvgQ4S1NSajUGuPbjQR2S
AtTj0XGI5e1nm8PZDftbomcxD5HUYaITQEDCyrm8a7xX8OZ5ySXakzdgXuNM5TgI
MPjcpOFvIARwF4MhovYFZtSInB5XiZYSiTAB03deVgy38JDsSPeQgwUVWjErrq/K
V/6kOg8EYd0uNFmUCwKX/ecbvAlnbfqAMX+YcL0ZrbVk0pBqxVvoGVXK8ex8Wbm1
eL9tyYK81Sc7TliXr2+R22CHDcMTTMImFLix5Gp6mk2Fd5TpMydV9c9S7NBCHYB4
rgcM9brgutFF6N8zqdBpa1FVH3cBE1A428/90kp4XU/kdQlxIvYBLBCylI25POEL
7oqhcJxljFLWXZdhmH7t3WV0RWOzITZHEp9foL8p6yAPzOSWPF98OlQU+FmLj3Y4
EZ61qLXIRxYpLf1aZh7GNKms5ZzOhKiZgw43UL3pl4xKhk2i9061IUKGSEHgIklk
BX34dmCALDlapt+Ggcm1uIe9BLCc4KADfixqNfr91dSOycFM2RajsSZCPrP9Gx8G
t8rYl8x+lLZ5ZxLkdTUP
=Fn8z
-----END PGP SIGNATURE-----
Merge tag 'nfs-for-4.20-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client updates from Trond Myklebust:
"Highlights include:
Stable fixes:
- Fix the NFSv4.1 r/wsize sanity checking
- Reset the RPC/RDMA credit grant properly after a disconnect
- Fix a missed page unlock after pg_doio()
Features and optimisations:
- Overhaul of the RPC client socket code to eliminate a locking
bottleneck and reduce the latency when transmitting lots of
requests in parallel.
- Allow parallelisation of the RPCSEC_GSS encoding of an RPC request.
- Convert the RPC client socket receive code to use iovec_iter() for
improved efficiency.
- Convert several NFS and RPC lookup operations to use RCU instead of
taking global locks.
- Avoid the need for BH-safe locks in the RPC/RDMA back channel.
Bugfixes and cleanups:
- Fix lock recovery during NFSv4 delegation recalls
- Fix the NFSv4 + NFSv4.1 "lookup revalidate + open file" case.
- Fixes for the RPC connection metrics
- Various RPC client layer cleanups to consolidate stream based
sockets
- RPC/RDMA connection cleanups
- Simplify the RPC/RDMA cleanup after memory operation failures
- Clean ups for NFS v4.2 copy completion and NFSv4 open state
reclaim"
* tag 'nfs-for-4.20-1' of git://git.linux-nfs.org/projects/trondmy/linux-nfs: (97 commits)
SUNRPC: Convert the auth cred cache to use refcount_t
SUNRPC: Convert auth creds to use refcount_t
SUNRPC: Simplify lookup code
SUNRPC: Clean up the AUTH cache code
NFS: change sign of nfs_fh length
sunrpc: safely reallow resvport min/max inversion
nfs: remove redundant call to nfs_context_set_write_error()
nfs: Fix a missed page unlock after pg_doio()
SUNRPC: Fix a compile warning for cmpxchg64()
NFSv4.x: fix lock recovery during delegation recall
SUNRPC: use cmpxchg64() in gss_seq_send64_fetch_and_inc()
xprtrdma: Squelch a sparse warning
xprtrdma: Clean up xprt_rdma_disconnect_inject
xprtrdma: Add documenting comments
xprtrdma: Report when there were zero posted Receives
xprtrdma: Move rb_flags initialization
xprtrdma: Don't disable BH's in backchannel server
xprtrdma: Remove memory address of "ep" from an error message
xprtrdma: Rename rpcrdma_qp_async_error_upcall
xprtrdma: Simplify RPC wake-ups on connect
...
Dispatching a report zones command through the request queue is a major
pain due to the command reply payload rewriting necessary. Given that
blkdev_report_zones() is executing everything synchronously, implement
report zones as a block device file operation instead, allowing major
simplification of the code in many places.
sd, null-blk, dm-linear and dm-flakey being the only block device
drivers supporting exposing zoned block devices, these drivers are
modified to provide the device side implementation of the
report_zones() block device file operation.
For device mappers, a new report_zones() target type operation is
defined so that the upper block layer calls blkdev_report_zones() can
be propagated down to the underlying devices of the dm targets.
Implementation for this new operation is added to the dm-linear and
dm-flakey targets.
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
[Damien]
* Changed method block_device argument to gendisk
* Various bug fixes and improvements
* Added support for null_blk, dm-linear and dm-flakey.
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
allocation for bigalloc file systems; fix up some syzbot-detected
races in EXT4_IOC_MOVE_EXT, EXT4_IOC_SWAP_BOOT, and ext4_remount; and
a few other miscellaneous bugs and optimizations.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAlvQYEcACgkQ8vlZVpUN
gaOPYAgAh0BF7mTRnHAp/qkR5ZhDi3ecb3TpNlnpfzoDqQhPYETFisc18DD4HwTj
wctwzSdYxYodeuPIK+R2bBzUy3FuSwtlER9cdr1ilcrUYPZHbir1rPPfTNb/oDGx
WNcd/aulLjuU1eKDODowqMOF2HDchiJHqJqMBa+LfCHck1x/bt2uqdjNA5A1p5AV
lp07DoXT54q5rWJDaXpbxTShWKhzHlRKbB9PKEvMHgPNl9sn5oRReRMKAW+WkT91
e3mfy/GhzhugdWxYUg2oAn3dbqYkkAjW96WnBhCQHioW9ASphjl7yBi1LWh2aPA4
haGxe5W3En8q678ZVtTVNJOyvbW81Q==
=VgdS
-----END PGP SIGNATURE-----
Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 updates from Ted Ts'o:
- further restructure ext4 documentation
- fix up ext4's delayed allocation for bigalloc file systems
- fix up some syzbot-detected races in EXT4_IOC_MOVE_EXT,
EXT4_IOC_SWAP_BOOT, and ext4_remount
- ... and a few other miscellaneous bugs and optimizations.
* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (21 commits)
ext4: fix use-after-free race in ext4_remount()'s error path
ext4: cache NULL when both default_acl and acl are NULL
docs: promote the ext4 data structures book to top level
docs: move ext4 administrative docs to admin-guide/
jbd2: fix use after free in jbd2_log_do_checkpoint()
ext4: propagate error from dquot_initialize() in EXT4_IOC_FSSETXATTR
ext4: fix setattr project check in fssetxattr ioctl
docs: make ext4 readme tables readable
docs: fix ext4 documentation table formatting problems
docs: generate a separate ext4 pdf file from the documentation
ext4: convert fault handler to use vm_fault_t type
ext4: initialize retries variable in ext4_da_write_inline_data_begin()
ext4: fix EXT4_IOC_SWAP_BOOT
ext4: fix build error when DX_DEBUG is defined
ext4: fix argument checking in EXT4_IOC_MOVE_EXT
ext4: fix reserved cluster accounting at page invalidation time
ext4: adjust reserved cluster count when removing extents
ext4: reduce reserved cluster count by number of allocated clusters
ext4: fix reserved cluster accounting at delayed write time
ext4: add new pending reservation mechanism
...
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAlvN/4kACgkQxWXV+ddt
WDtA0Q//fIKXjjxDLa/EM4fz63urfczoxq/nVRVXveoTtAjT+QOTIY0iRblLhKQg
ehy86ygYmwfnBjCzZRTIjtvFJtt93kWH3VtqGAOMcvA/sMGP68CDn8STyIaCsw2o
UN0qEIYcQ5wj5aJlJhlWZgx9bvjK3xkMuNO/uPWz0M0OL+KTk6O265FjIot0M6Pa
XsihJCn9ZrWL9peevGDUXdoPuXQKaU8aW47qe+989Ya0oPD11Knpn75J+t3U07/h
sty6pFIQMCBSKOjXnCKhysv3wSyewzRyMznVPz6EhRfzn7od5GLmvG7cANiWcV6K
jep5Hd7cJq/BeIwpTSAxh+ygxbce8EGIm9NyUPAPXDPtfgMv1Zf2QLEPZhi729xk
OpSF+eOGuRdSCur7ng629LqOANjb+D1939QKrzIwO5SC3xSZ5Ht258IWcJ+FlDNz
Cfxk8b3rWhsS9qSSmiq3kdQ3ECEOzFBYx7d8m2FZ2z3mViA1VdIROHhX97VCAcVu
X1dq5kUycyioGak86Ce/cexpmkwcwf12ypCZWz7tL+pfgDKcAKFHjA0vNLzYjZEy
mxHZijjJXrg8vWTqMCqgDhIDQVzmaOjFo7upXc+5MsL8sbC71V38QvTZ3F6jcnAa
kmiE8lDRQmnM38/5U1EEjROHpBaRLbpSfZvQcfSOGNkaycb2cFI=
=g5+b
-----END PGP SIGNATURE-----
Merge tag 'for-4.20-part1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
"This is the first batch with fixes and some nice performance
improvements.
Preliminary results show eg. more files/sec in fsmark, better perf on
multi-threaded workloads (filebench, dbench), fewer context switches
and overall better memory allocation characteristics (multiple
benchmarks).
Apart from general performance, there's an improvement for qgroups +
balance workload that's been troubling our users.
Note for stable: there are 20+ patches tagged for stable, out of 90.
Not all of them apply cleanly on all stable versions but the conflicts
are mostly due to simple cleanups and resolving should be obvious. The
fixes are otherwise independent.
Performance improvements:
- transition between blocking and spinning modes of path is gone,
which originally resulted to more unnecessary wakeups and updates
to the path locks, the effects are measurable and improve latency
and scalability
- qgroups: first batch of changes that should speedup balancing with
qgroups on, skip quota accounting on unchanged subtrees, overall
gain is about 30+% in runtime
- use rb-tree with cached first node for several structures, small
improvement to avoid pointer chasing
Fixes:
- trim
- fix: some blockgroups could have been missed if their logical
address was past the total filesystem size (ie. after a lot of
balancing)
- better error reporting, after processing blockgroups and whole
device
- fix: continue trimming block groups after an error is
encountered
- check for trim support of the device earlier and avoid some
unnecessary work
- less interaction with transaction commit that improves latency
on slower storage (eg. image files over NFS)
- fsync
- fix warning when replaying log after fsync of a O_TMPFILE
- fix wrong dentries after fsync of file that got its parent
replaced
- qgroups: fix rescan that might misc some dirty groups
- don't clean dirty pages during buffered writes, this could lead to
lost updates in some corner cases
- some block groups could have been delayed in creation, if the
allocation triggered another one
- error handling improvements
Cleanups:
- removed unused struct members and variables
- function return type cleanups
- delayed refs code refactoring
- protect against deadlock that could be caused by crafted image that
tries to allocate from a tree that's locked already"
* tag 'for-4.20-part1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (93 commits)
btrfs: switch return_bigger to bool in find_ref_head
btrfs: remove fs_info from btrfs_should_throttle_delayed_refs
btrfs: remove fs_info from btrfs_check_space_for_delayed_refs
btrfs: delayed-ref: pass delayed_refs directly to btrfs_delayed_ref_lock
btrfs: delayed-ref: pass delayed_refs directly to btrfs_select_ref_head
btrfs: qgroup: move the qgroup->members check out from (!qgroup)'s else branch
btrfs: relocation: Remove redundant tree level check
btrfs: relocation: Cleanup while loop using rbtree_postorder_for_each_entry_safe
btrfs: qgroup: Avoid calling qgroup functions if qgroup is not enabled
Btrfs: fix wrong dentries after fsync of file that got its parent replaced
Btrfs: fix warning when replaying log after fsync of a tmpfile
btrfs: drop min_size from evict_refill_and_join
btrfs: assert on non-empty delayed iputs
btrfs: make sure we create all new block groups
btrfs: reset max_extent_size on clear in a bitmap
btrfs: protect space cache inode alloc with GFP_NOFS
btrfs: release metadata before running delayed refs
Btrfs: kill btrfs_clear_path_blocking
btrfs: dev-replace: remove pointless assert in write unlock
btrfs: dev-replace: move replace members out of fs_info
...
Pull siginfo updates from Eric Biederman:
"I have been slowly sorting out siginfo and this is the culmination of
that work.
The primary result is in several ways the signal infrastructure has
been made less error prone. The code has been updated so that manually
specifying SEND_SIG_FORCED is never necessary. The conversion to the
new siginfo sending functions is now complete, which makes it
difficult to send a signal without filling in the proper siginfo
fields.
At the tail end of the patchset comes the optimization of decreasing
the size of struct siginfo in the kernel from 128 bytes to about 48
bytes on 64bit. The fundamental observation that enables this is by
definition none of the known ways to use struct siginfo uses the extra
bytes.
This comes at the cost of a small user space observable difference.
For the rare case of siginfo being injected into the kernel only what
can be copied into kernel_siginfo is delivered to the destination, the
rest of the bytes are set to 0. For cases where the signal and the
si_code are known this is safe, because we know those bytes are not
used. For cases where the signal and si_code combination is unknown
the bits that won't fit into struct kernel_siginfo are tested to
verify they are zero, and the send fails if they are not.
I made an extensive search through userspace code and I could not find
anything that would break because of the above change. If it turns out
I did break something it will take just the revert of a single change
to restore kernel_siginfo to the same size as userspace siginfo.
Testing did reveal dependencies on preferring the signo passed to
sigqueueinfo over si->signo, so bit the bullet and added the
complexity necessary to handle that case.
Testing also revealed bad things can happen if a negative signal
number is passed into the system calls. Something no sane application
will do but something a malicious program or a fuzzer might do. So I
have fixed the code that performs the bounds checks to ensure negative
signal numbers are handled"
* 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (80 commits)
signal: Guard against negative signal numbers in copy_siginfo_from_user32
signal: Guard against negative signal numbers in copy_siginfo_from_user
signal: In sigqueueinfo prefer sig not si_signo
signal: Use a smaller struct siginfo in the kernel
signal: Distinguish between kernel_siginfo and siginfo
signal: Introduce copy_siginfo_from_user and use it's return value
signal: Remove the need for __ARCH_SI_PREABLE_SIZE and SI_PAD_SIZE
signal: Fail sigqueueinfo if si_signo != sig
signal/sparc: Move EMT_TAGOVF into the generic siginfo.h
signal/unicore32: Use force_sig_fault where appropriate
signal/unicore32: Generate siginfo in ucs32_notify_die
signal/unicore32: Use send_sig_fault where appropriate
signal/arc: Use force_sig_fault where appropriate
signal/arc: Push siginfo generation into unhandled_exception
signal/ia64: Use force_sig_fault where appropriate
signal/ia64: Use the force_sig(SIGSEGV,...) in ia64_rt_sigreturn
signal/ia64: Use the generic force_sigsegv in setup_frame
signal/arm/kvm: Use send_sig_mceerr
signal/arm: Use send_sig_fault where appropriate
signal/arm: Use force_sig_fault where appropriate
...
Pull networking updates from David Miller:
1) Add VF IPSEC offload support in ixgbe, from Shannon Nelson.
2) Add zero-copy AF_XDP support to i40e, from Björn Töpel.
3) All in-tree drivers are converted to {g,s}et_link_ksettings() so we
can get rid of the {g,s}et_settings ethtool callbacks, from Michal
Kubecek.
4) Add software timestamping to veth driver, from Michael Walle.
5) More work to make packet classifiers and actions lockless, from Vlad
Buslov.
6) Support sticky FDB entries in bridge, from Nikolay Aleksandrov.
7) Add ipv6 version of IP_MULTICAST_ALL sockopt, from Andre Naujoks.
8) Support batching of XDP buffers in vhost_net, from Jason Wang.
9) Add flow dissector BPF hook, from Petar Penkov.
10) i40e vf --> generic iavf conversion, from Jesse Brandeburg.
11) Add NLA_REJECT netlink attribute policy type, to signal when users
provide attributes in situations which don't make sense. From
Johannes Berg.
12) Switch TCP and fair-queue scheduler over to earliest departure time
model. From Eric Dumazet.
13) Improve guest receive performance by doing rx busy polling in tx
path of vhost networking driver, from Tonghao Zhang.
14) Add per-cgroup local storage to bpf
15) Add reference tracking to BPF, from Joe Stringer. The verifier can
now make sure that references taken to objects are properly released
by the program.
16) Support in-place encryption in TLS, from Vakul Garg.
17) Add new taprio packet scheduler, from Vinicius Costa Gomes.
18) Lots of selftests additions, too numerous to mention one by one here
but all of which are very much appreciated.
19) Support offloading of eBPF programs containing BPF to BPF calls in
nfp driver, frm Quentin Monnet.
20) Move dpaa2_ptp driver out of staging, from Yangbo Lu.
21) Lots of u32 classifier cleanups and simplifications, from Al Viro.
22) Add new strict versions of netlink message parsers, and enable them
for some situations. From David Ahern.
23) Evict neighbour entries on carrier down, also from David Ahern.
24) Support BPF sk_msg verdict programs with kTLS, from Daniel Borkmann
and John Fastabend.
25) Add support for filtering route dumps, from David Ahern.
26) New igc Intel driver for 2.5G parts, from Sasha Neftin et al.
27) Allow vxlan enslavement to bridges in mlxsw driver, from Ido
Schimmel.
28) Add queue and stack map types to eBPF, from Mauricio Vasquez B.
29) Add back byte-queue-limit support to r8169, with all the bug fixes
in other areas of the driver it works now! From Florian Westphal and
Heiner Kallweit.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2147 commits)
tcp: add tcp_reset_xmit_timer() helper
qed: Fix static checker warning
Revert "be2net: remove desc field from be_eq_obj"
Revert "net: simplify sock_poll_wait"
net: socionext: Reset tx queue in ndo_stop
net: socionext: Add dummy PHY register read in phy_write()
net: socionext: Stop PHY before resetting netsec
net: stmmac: Set OWN bit for jumbo frames
arm64: dts: stratix10: Support Ethernet Jumbo frame
tls: Add maintainers
net: ethernet: ti: cpsw: unsync mcast entries while switch promisc mode
octeontx2-af: Support for NIXLF's UCAST/PROMISC/ALLMULTI modes
octeontx2-af: Support for setting MAC address
octeontx2-af: Support for changing RSS algorithm
octeontx2-af: NIX Rx flowkey configuration for RSS
octeontx2-af: Install ucast and bcast pkt forwarding rules
octeontx2-af: Add LMAC channel info to NIXLF_ALLOC response
octeontx2-af: NPC MCAM and LDATA extract minimal configuration
octeontx2-af: Enable packet length and csum validation
octeontx2-af: Support for VTAG strip and capture
...
Send probes to all the unprobed fileservers in a fileserver list on all
addresses simultaneously in an attempt to find out the fastest route whilst
not getting stuck for 20s on any server or address that we don't get a
reply from.
This alleviates the problem whereby attempting to access a new server can
take a long time because the rotation algorithm ends up rotating through
all servers and addresses until it finds one that responds.
Signed-off-by: David Howells <dhowells@redhat.com>
Implement support for talking to YFS-variant fileservers in the cache
manager and the filesystem client. These implement upgraded services on
the same port as their AFS services.
YFS fileservers provide expanded capabilities over AFS.
Signed-off-by: David Howells <dhowells@redhat.com>
Increase the sizes of the volume ID to 64 bits and the vnode ID (inode
number equivalent) to 96 bits to allow the support of YFS.
This requires the iget comparator to check the vnode->fid rather than i_ino
and i_generation as i_ino is not sufficiently capacious. It also requires
this data to be placed into the vnode cache key for fscache.
For the moment, just discard the top 32 bits of the vnode ID when returning
it though stat.
Signed-off-by: David Howells <dhowells@redhat.com>
afs_extract_data sets up a temporary iov_iter and passes it to AF_RXRPC
each time it is called to describe the remaining buffer to be filled.
Instead:
(1) Put an iterator in the afs_call struct.
(2) Set the iterator for each marshalling stage to load data into the
appropriate places. A number of convenience functions are provided to
this end (eg. afs_extract_to_buf()).
This iterator is then passed to afs_extract_data().
(3) Use the new ITER_DISCARD iterator to discard any excess data provided
by FetchData.
Signed-off-by: David Howells <dhowells@redhat.com>
Include the site of detection of AFS protocol errors in trace lines to
better be able to determine what went wrong.
Signed-off-by: David Howells <dhowells@redhat.com>
Pull scheduler updates from Ingo Molnar:
"The main changes are:
- Migrate CPU-intense 'misfit' tasks on asymmetric capacity systems,
to better utilize (much) faster 'big core' CPUs. (Morten Rasmussen,
Valentin Schneider)
- Topology handling improvements, in particular when CPU capacity
changes and related load-balancing fixes/improvements (Morten
Rasmussen)
- ... plus misc other improvements, fixes and updates"
* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits)
sched/completions/Documentation: Add recommendation for dynamic and ONSTACK completions
sched/completions/Documentation: Clean up the document some more
sched/completions/Documentation: Fix a couple of punctuation nits
cpu/SMT: State SMT is disabled even with nosmt and without "=force"
sched/core: Fix comment regarding nr_iowait_cpu() and get_iowait_load()
sched/fair: Remove setting task's se->runnable_weight during PELT update
sched/fair: Disable LB_BIAS by default
sched/pelt: Fix warning and clean up IRQ PELT config
sched/topology: Make local variables static
sched/debug: Use symbolic names for task state constants
sched/numa: Remove unused numa_stats::nr_running field
sched/numa: Remove unused code from update_numa_stats()
sched/debug: Explicitly cast sched_feat() to bool
sched/core: Disable SD_PREFER_SIBLING on asymmetric CPU capacity domains
sched/fair: Don't move tasks to lower capacity CPUs unless necessary
sched/fair: Set rq->rd->overload when misfit
sched/fair: Wrap rq->rd->overload accesses with READ/WRITE_ONCE()
sched/core: Change root_domain->overload type to int
sched/fair: Change 'prefer_sibling' type to bool
sched/fair: Kick nohz balance if rq->misfit_task_load
...
Pull RCU updates from Ingo Molnar:
"The biggest change in this cycle is the conclusion of the big
'simplify RCU to two primary flavors' consolidation work - i.e.
there's a single RCU flavor for any kernel variant (PREEMPT and
!PREEMPT):
- Consolidate the RCU-bh, RCU-preempt, and RCU-sched flavors into a
single flavor similar to RCU-sched in !PREEMPT kernels and into a
single flavor similar to RCU-preempt (but also waiting on
preempt-disabled sequences of code) in PREEMPT kernels.
This branch also includes a refactoring of
rcu_{nmi,irq}_{enter,exit}() from Byungchul Park.
- Now that there is only one RCU flavor in any given running kernel,
the many "rsp" pointers are no longer required, and this cleanup
series removes them.
- This branch carries out additional cleanups made possible by the
RCU flavor consolidation, including inlining now-trivial
functions, updating comments and definitions, and removing
now-unneeded rcutorture scenarios.
- Now that there is only one flavor of RCU in any running kernel,
there is also only on rcu_data structure per CPU. This means that
the rcu_dynticks structure can be merged into the rcu_data
structure, a task taken on by this branch. This branch also
contains a -rt-related fix from Mike Galbraith.
There were also other updates:
- Documentation updates, including some good-eye catches from Joel
Fernandes.
- SRCU updates, most notably changes enabling call_srcu() to be
invoked very early in the boot sequence.
- Torture-test updates, including some preliminary work towards
making rcutorture better able to find problems that result in
insufficient grace-period forward progress.
- Initial changes to RCU to better promote forward progress of grace
periods, including fixing a bug found by Marius Hillenbrand and
David Woodhouse, with the fix suggested by Peter Zijlstra"
* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (140 commits)
srcu: Make early-boot call_srcu() reuse workqueue lists
rcutorture: Test early boot call_srcu()
srcu: Make call_srcu() available during very early boot
rcu: Convert rcu_state.ofl_lock to raw_spinlock_t
rcu: Remove obsolete ->dynticks_fqs and ->cond_resched_completed
rcu: Switch ->dynticks to rcu_data structure, remove rcu_dynticks
rcu: Switch dyntick nesting counters to rcu_data structure
rcu: Switch urgent quiescent-state requests to rcu_data structure
rcu: Switch lazy counts to rcu_data structure
rcu: Switch last accelerate/advance to rcu_data structure
rcu: Switch ->tick_nohz_enabled_snap to rcu_data structure
rcu: Merge rcu_dynticks structure into rcu_data structure
rcu: Remove unused rcu_dynticks_snap() from Tiny RCU
rcu: Convert "1UL << x" to "BIT(x)"
rcu: Avoid resched_cpu() when rescheduling the current CPU
rcu: More aggressively enlist scheduler aid for nohz_full CPUs
rcu: Compute jiffies_till_sched_qs from other kernel parameters
rcu: Provide functions for determining if call_rcu() has been invoked
rcu: Eliminate ->rcu_qs_ctr from the rcu_dynticks structure
rcu: Motivate Tiny RCU forward progress
...
- Add support for trace events to hwmon core
- Add support for NCT6797D, NCT6798D, MAX31725/6, LTM4686
- Support all AMD Family 15h Model 6xh and Model 7xh processors
in k10temp driver
- Convert ina3221 driver to _info API
- Fixes, cleanups, and improvements in various drivers.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIbBAABAgAGBQJbyVR3AAoJEMsfJm/On5mB7zgP92NONVqvIBBFRj7ohjhz4iya
y/8ITclQvxMzgrBTuYS9avA9NACKtaerfpiS8m4XjrKAt66/x2h6yTGRVKEDTqXA
Qs9OZG83tocBYqZ3EI3FtMhbx3eOIdCj+v34ARNqPZSXJGkpiKa+9BmjR2VuMCtG
ONcnk66os4g7rjgsBqYUqCkAQkse/Oys6iMPpg4VfZlL6/sKWCglsQVCqplW7dFR
wxZtJqKYHNE4sddtT67cVaxKcuNdt96EmEoJCGirH0B2wW7FPefR4sYO9cmykDi+
J6b7VErHWSi75LrXVZaXHllkau3NvZDCoGyVbVPu98w4TzjylCXciskOmXdTQkso
57I2aPMUa4VLde27p/a3MLUXhswFng2w62rz2l/BLlWq4KbTorPZxnKrHT8EpU7T
eQrLC5Q+mdJCRdhcDQEcx8+SKtdO9UFLUFD5xb2D74ZJ4N2DYYFa62Vr9I6Mk+AV
aZa5wLQW6W5UY8+AcbhK2AJE60LiPaJsuobiLLwZ/5cB2M33Hgi0ySDkq/Y4fFjA
t/D5iWDVhbwbK15baJyJEAlNQAM+Nr284RGfWTfHhqKxmx3/fIKw5OBjsRzIP5KV
NN5JA8s8tuLP10MIY+jXRTu2knkTKQXgWp8YLuS8gI0meCiGt2fZ8kXXW5b9T5+F
sH7cZuCGf+PRkEeYp8o=
=Urmz
-----END PGP SIGNATURE-----
Merge tag 'hwmon-for-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon updates from Guenter Roeck:
- Add support for trace events to hwmon core
- Add support for NCT6797D, NCT6798D, MAX31725/6, LTM4686
- Support all AMD Family 15h Model 6xh and Model 7xh processors in
k10temp driver
- Convert ina3221 driver to _info API
- Fixes, cleanups, and improvements in various drivers
* tag 'hwmon-for-v4.20' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: (46 commits)
hwmon: (pmbus) Fix page count auto-detection.
hwmon: (pmbus) remove redundant 'default n' from Kconfig
hwmon: (core) Add trace events to _attr_show/store functions
hwmon: (ina3221) Use _info API to register hwmon device
hwmon: (npcm-750-pwm-fan) Change initial pwm target to 255
hwmon: (ina3221) Validate shunt resistor value from DT
hwmon: (tmp421) make const array 'names' static
hwmon: (core) Add hwmon_in_enable attribute
hwmon: (ina3221) mark PM functions as __maybe_unused
hwmon: (ina3221) Read channel input source info from DT
dt-bindings: hwmon: Add ina3221 documentation
hwmon: (ina3221) Add suspend and resume functions
hwmon: (ina3221) Fix INA3221_CONFIG_MODE macros
hwmon: (ina3221) Add INA3221_CONFIG to volatile_table
MAINTAINERS: Update PMBUS maintainer entry
hwmon: (pwm-fan) Set fan speed to 0 on suspend
hwmon: (pwm-fan) Silence error on probe deferral
hwmon: (scpi-hwmon) remove redundant continue
hwmon: (nct6775) Add support for NCT6798D
hwmon: (nct6775) Add support for NCT6797D
...
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAlvNQKgQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgps+8D/9Iy6YIeoPwN10gYsqIh0P2fS3wKzL3kiww
3vFsWO78PzgLxUlNmB7teLtNFc/R5mi8becZmAdvs9za5YFZk56o3Ifv1x+e+z00
VY1/gxhiJD6suLeJ6lECnERGDaiWOZVRMo2TE17vxYGW6GGaa0Ts6PUUXmpla1u5
WKctgt0Qv9WVNyiIdLdeHqzKJwsSSwNTt8fK7eFhy3x8e0CwJr+GtXckbbW3LFkY
lug0npsTli3EmEPMovZhd25SjZmTk5GTM+ADZQ7Tnv5KXoDWB9jn6TcCSAi3G+5d
5WUVwfnDyYJiH8qvlg5tRJ690muIy3xMOmpr7QBQ0YnR/LQ3EW+1CVfqD+qimgLH
TXzlREXQpBP3YlxSDS5nddz4o5z84GZmC9B/43ujPaZKIQ6eBXYdkmQH7tPtSugm
C6VGomR5tHotjxIiAsexh/5hAus+wW8bObKGTPTyINT0ub3XNclwCKLh26CgI9ie
WvbS9g3j/KPvu/7s6weZpgD+cks0YdWe/XdXXxiHwsGI9h3J2aJna5RQt1rKWDm5
wGCgbc/B8eSwiWx+GXlqdB9/Dy/bGXOnSTDnKpEVl1f5zNjeLwUKXbjvkMefWs4m
jEIcquuDETORY+ZYEfa5YbmS4Lhskr0kzMVTVkZ++81tAWpSCU9Xh3IHrR8TNpt+
J0oh0FHBDg==
=LRTT
-----END PGP SIGNATURE-----
Merge tag 'for-4.20/block-20181021' of git://git.kernel.dk/linux-block
Pull block layer updates from Jens Axboe:
"This is the main pull request for block changes for 4.20. This
contains:
- Series enabling runtime PM for blk-mq (Bart).
- Two pull requests from Christoph for NVMe, with items such as;
- Better AEN tracking
- Multipath improvements
- RDMA fixes
- Rework of FC for target removal
- Fixes for issues identified by static checkers
- Fabric cleanups, as prep for TCP transport
- Various cleanups and bug fixes
- Block merging cleanups (Christoph)
- Conversion of drivers to generic DMA mapping API (Christoph)
- Series fixing ref count issues with blkcg (Dennis)
- Series improving BFQ heuristics (Paolo, et al)
- Series improving heuristics for the Kyber IO scheduler (Omar)
- Removal of dangerous bio_rewind_iter() API (Ming)
- Apply single queue IPI redirection logic to blk-mq (Ming)
- Set of fixes and improvements for bcache (Coly et al)
- Series closing a hotplug race with sysfs group attributes (Hannes)
- Set of patches for lightnvm:
- pblk trace support (Hans)
- SPDX license header update (Javier)
- Tons of refactoring patches to cleanly abstract the 1.2 and 2.0
specs behind a common core interface. (Javier, Matias)
- Enable pblk to use a common interface to retrieve chunk metadata
(Matias)
- Bug fixes (Various)
- Set of fixes and updates to the blk IO latency target (Josef)
- blk-mq queue number updates fixes (Jianchao)
- Convert a bunch of drivers from the old legacy IO interface to
blk-mq. This will conclude with the removal of the legacy IO
interface itself in 4.21, with the rest of the drivers (me, Omar)
- Removal of the DAC960 driver. The SCSI tree will introduce two
replacement drivers for this (Hannes)"
* tag 'for-4.20/block-20181021' of git://git.kernel.dk/linux-block: (204 commits)
block: setup bounce bio_sets properly
blkcg: reassociate bios when make_request() is called recursively
blkcg: fix edge case for blk_get_rl() under memory pressure
nvme-fabrics: move controller options matching to fabrics
nvme-rdma: always have a valid trsvcid
mtip32xx: fully switch to the generic DMA API
rsxx: switch to the generic DMA API
umem: switch to the generic DMA API
sx8: switch to the generic DMA API
sx8: remove dead IF_64BIT_DMA_IS_POSSIBLE code
skd: switch to the generic DMA API
ubd: remove use of blk_rq_map_sg
nvme-pci: remove duplicate check
drivers/block: Remove DAC960 driver
nvme-pci: fix hot removal during error handling
nvmet-fcloop: suppress a compiler warning
nvme-core: make implicit seed truncation explicit
nvmet-fc: fix kernel-doc headers
nvme-fc: rework the request initialization code
nvme-fc: introduce struct nvme_fcp_op_w_sgl
...
Stable bugfixes:
- Reset credit grant properly after a disconnect
Other bugfixes and cleanups:
- xprt_release_rqst_cong is called outside of transport_lock
- Create more MRs at a time and toss out old ones during recovery
- Various improvements to the RDMA connection and disconnection code:
- Improve naming of trace events, functions, and variables
- Add documenting comments
- Fix metrics and stats reporting
- Fix a tracepoint sparse warning
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAlvHmcUACgkQ18tUv7Cl
QOv5Mg//ZIL92L6WqW2C+Tddr4UcPg1YphBEwGo3+TrswSRg3ncDiTQ8ycOOrmoy
7m5Oe5I1uEM0Ejqu0lh0uoxJlxRtMF0pwpnTA2Mx6bb4GLSTXjJQomKBhZ3v6owo
RQaQZTnAT+T5w1jZMuImdZ+c1zNNiFonSdPO7Er5jbdczvY6N7bg84goLoXLZkjk
cuYFbBl3DAyoUJ1usgiuCZLbMcEe0isJEtFU45dLkxxFkvNk+gO8UtA48qe0rFNg
8LQMHqhXDhHbdqLFpIRdvaanRpi8VjhCukE+Af9z/y0XNPYItWKTm0clkZ1bu/D4
/Q6gUnCU8KeSzlPqrT7nATO6L5sHlqlE9vSbJpWguvgBg9JbMDZquh0gejVqGr4t
1YbJyNh/anl5Xm56CIADEbQK3QocyDRwk9tQhlOUEwBu7rgQrU7NO+1CHgXRD94c
iNI892W9FZZQxKOkWnb3DtgpnmuQ4k9tLND/SSnqOllADpztDag+czOxtOHOxlK0
sVh4U82JtYtP9ubhKzFvTDlFv3rcjE86Nn55mAgCFk/XBDLF3kMjjZcS527YLkWY
OqDVeKin7nKv1ZfeV9msWbaKp4w2sNhgtROGQr4g1/FktRXS6b8XsTxn7anyVYzM
l8SJx66q8XFaWcp0Kwp55oOyhh16dhhel34lWi+wAlF79ypMGlE=
=SxSD
-----END PGP SIGNATURE-----
Merge tag 'nfs-rdma-for-4.20-1' of git://git.linux-nfs.org/projects/anna/linux-nfs
NFS RDMA client updates for Linux 4.20
Stable bugfixes:
- Reset credit grant properly after a disconnect
Other bugfixes and cleanups:
- xprt_release_rqst_cong is called outside of transport_lock
- Create more MRs at a time and toss out old ones during recovery
- Various improvements to the RDMA connection and disconnection code:
- Improve naming of trace events, functions, and variables
- Add documenting comments
- Fix metrics and stats reporting
- Fix a tracepoint sparse warning
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Number of qgroup dirty extents is directly linked to the performance
overhead, so add a new trace event, trace_qgroup_num_dirty_extents(), to
record how many dirty extents is processed in
btrfs_qgroup_account_extents().
This will be pretty handy to analyze later balance performance
improvement.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There are two members in struct btrfs_root which indicate root's
objectid: objectid and root_key.objectid.
They are both set to the same value in __setup_root():
static void __setup_root(struct btrfs_root *root,
struct btrfs_fs_info *fs_info,
u64 objectid)
{
...
root->objectid = objectid;
...
root->root_key.objectid = objecitd;
...
}
and not changed to other value after initialization.
grep in btrfs directory shows both are used in many places:
$ grep -rI "root->root_key.objectid" | wc -l
133
$ grep -rI "root->objectid" | wc -l
55
(4.17, inc. some noise)
It is confusing to have two similar variable names and it seems
that there is no rule about which should be used in a certain case.
Since ->root_key itself is needed for tree reloc tree, let's remove
'objecitd' member and unify code to use ->root_key.objectid in all places.
Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Conflicts were easy to resolve using immediate context mostly,
except the cls_u32.c one where I simply too the entire HEAD
chunk.
Signed-off-by: David S. Miller <davem@davemloft.net>
Trace events are useful for people who collect data from the
Ftrace outputs. There're people who analyse the relationship
of cpufreq, thermal and hwmon (power/voltage/current) using
the convenient and timestamped Ftrace outputs, while unlike
cpufreq and thermal subsystems the hwmon does not have trace
events supported yet.
So this patch adds initial trace events for the hwmon core.
To call hwmon_attr_base() for aligned attr index numbers, it
also moves the function upward.
Ftrace outputs:
...: hwmon_attr_show_string: index=2, attr_name=in2_label, val=VDD_5V
...: hwmon_attr_show: index=2, attr_name=in2_input, val=5112
...: hwmon_attr_show: index=2, attr_name=curr2_input, val=440
Note that the _attr_show and _attr_store functions are tied
to the _with_info API. So a hwmon driver requiring the trace
events feature should use _with_info API to register a hwmon
device.
Signed-off-by: Nicolin Chen <nicoleotsuka@gmail.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
-----BEGIN PGP SIGNATURE-----
iQIVAwUAW7vWy/u3V2unywtrAQJkeA//UTmUmv9bL8k38XvnnHQ82iTyIjvZ40md
9XbNagwjkxuDv2eqW1rTxk5+Po596Nb4s/cBvpyAABjCA+zoT0/fklK5PqsSKFla
ibAHlrBiv/1FlSUSTf4dvAfVVszRAItCP4OhKTU+0fkX3Hq75T7Lty1h92Yk/ijx
ccqSV9K/nB4WN9bbg+sDngu3RAVR3VJ5RhVSmrnE9xd3M9ihz2RHacJ6GWNQf5PV
j8cKuP3qY/2K245hb4sGn1i9OO6/WVVQhnojYA+jR8Vg4+8mY0Y++MLsW8ywlsuo
xlrDCVPBaR6YqjnlDoOXIXYZ8EbEsmy5FP51zfEZJqeKJcVPLavQyoku0xJ8fHgr
oK/3DBUkwgi0hiq4dC1CrmXZRSbBQqGTEJMA6AKh2ApXb9BkmhmqRGMx+2nqocwF
TjAQmCu0FJTAYdwJuqzv5SRFjocjrHGcYDpQ27CgNqSiiDlh8CZ1ej/mbV+o8pUQ
tOrpnRCLY8Nl55AgJbl2260mKcpcKQcRUwggKAOzqRjXwcy9q3KKm8H149qejJSl
AQihDkPVqBZrv5ODKR0z88e5cVF9ad+vBPd+2TrRVGtTWhm/q8N/0Hh0M2sYDSc6
vLklYfwI4U5U1/naJC9WCkW+ybj5ekVNHgW91ICvQVNG0TH2zfE7sUGMQn6lRZL8
jfpLTVbtKfI=
=q6E6
-----END PGP SIGNATURE-----
Merge tag 'rxrpc-fixes-20181008' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs
David Howells says:
====================
rxrpc: Fix packet reception code
Here are a set of patches that prepares for and fix problems in rxrpc's
package reception code. There serious problems are:
(A) There's a window between binding the socket and setting the data_ready
hook in which packets can find their way into the UDP socket's receive
queues.
(B) The skb_recv_udp() will return an error (and clear the error state) if
there was an error on the Tx side. rxrpc doesn't handle this.
(C) The rxrpc data_ready handler doesn't fully drain the UDP receive
queue.
(D) The rxrpc data_ready handler assumes it is called in a non-reentrant
state.
The second patch fixes (A) - (C); the third patch renders (B) and (C)
non-issues by using the recap_rcv hook instead of data_ready - and the
final patch fixes (D). That last is the most complex.
The preparatory patches are:
(1) Fix some places that are doing things in the wrong net namespace.
(2) Stop taking the rcu read lock as it's held by the IP input routine in
the call chain.
(3) Only end the Tx phase if *we* rotated the final packet out of the Tx
buffer.
(4) Don't assume that the call state won't change after dropping the
call_state lock.
(5) Only take receive window and MTU suze parameters from an ACK packet if
it's the latest ACK packet.
(6) Record connection-level abort information correctly.
(7) Fix a trace line.
And then there are three main patches - note that these are mixed in with
the preparatory patches somewhat:
(1) Fix the setup window (A), skb_recv_udp() error check (B) and packet
drainage (C).
(2) Switch to using the encap_rcv instead of data_ready to cut out the
effects of the UDP read queues and get the packets delivered directly.
(3) Add more locking into the various packet input paths to defend against
re-entrance (D).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the rxrpc_tx_packet trace line by storing the where parameter.
Fixes: 4764c0da69 ("rxrpc: Trace packet transmission")
Signed-off-by: David Howells <dhowells@redhat.com>
Ingo writes:
"scheduler fixes:
These fixes address a rather involved performance regression between
v4.17->v4.19 in the sched/numa auto-balancing code. Since distros
really need this fix we accelerated it to sched/urgent for a faster
upstream merge.
NUMA scheduling and balancing performance is now largely back to
v4.17 levels, without reintroducing the NUMA placement bugs that
v4.18 and v4.19 fixed.
Many thanks to Srikar Dronamraju, Mel Gorman and Jirka Hladky, for
reporting, testing, re-testing and solving this rather complex set of
bugs."
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/numa: Migrate pages to local nodes quicker early in the lifetime of a task
mm, sched/numa: Remove rate-limiting of automatic NUMA balancing migration
sched/numa: Avoid task migration for small NUMA improvement
mm/migrate: Use spin_trylock() while resetting rate limit
sched/numa: Limit the conditions where scan period is reset
sched/numa: Reset scan rate whenever task moves across nodes
sched/numa: Pass destination CPU as a parameter to migrate_task_rq
sched/numa: Stop multiple tasks from moving to the CPU at the same time
Minor conflict in net/core/rtnetlink.c, David Ahern's bug fix in 'net'
overlapped the renaming of a netlink attribute in net-next.
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus recently observed that if we did not worry about the padding
member in struct siginfo it is only about 48 bytes, and 48 bytes is
much nicer than 128 bytes for allocating on the stack and copying
around in the kernel.
The obvious thing of only adding the padding when userspace is
including siginfo.h won't work as there are sigframe definitions in
the kernel that embed struct siginfo.
So split siginfo in two; kernel_siginfo and siginfo. Keeping the
traditional name for the userspace definition. While the version that
is used internally to the kernel and ultimately will not be padded to
128 bytes is called kernel_siginfo.
The definition of struct kernel_siginfo I have put in include/signal_types.h
A set of buildtime checks has been added to verify the two structures have
the same field offsets.
To make it easy to verify the change kernel_siginfo retains the same
size as siginfo. The reduction in size comes in a following change.
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
Clean up: Use a function name that is consistent with the RDMA core
API and with other consumers. Because this is a function that is
invoked from outside the rpcrdma.ko module, add an appropriate
documenting comment.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Clean up: Use a function name that is consistent with the RDMA core
API and with other consumers. Because this is a function that is
invoked from outside the rpcrdma.ko module, add an appropriate
documenting comment.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Clean up the names of trace events related to MRs so that it's
easy to enable these with a glob.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
When a memory operation fails, the MR's driver state might not match
its hardware state. The only reliable recourse is to dereg the MR.
This is done in ->ro_recover_mr, which then attempts to allocate a
fresh MR to replace the released MR.
Since commit e2ac236c0b ("xprtrdma: Allocate MRs on demand"),
xprtrdma dynamically allocates MRs. It can add more MRs whenever
they are needed.
That makes it possible to simply release an MR when a memory
operation fails, instead of "recovering" it. It will automatically
be replaced by the on-demand MR allocator.
This commit is a little larger than I wanted, but it replaces
->ro_recover_mr, rb_recovery_lock, rb_recovery_worker, and the
rb_stale_mrs list with a generic work queue.
Since MRs are no longer orphaned, the mrs_orphaned metric is no
longer used.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Rate limiting of page migrations due to automatic NUMA balancing was
introduced to mitigate the worst-case scenario of migrating at high
frequency due to false sharing or slowly ping-ponging between nodes.
Since then, a lot of effort was spent on correctly identifying these
pages and avoiding unnecessary migrations and the safety net may no longer
be required.
Jirka Hladky reported a regression in 4.17 due to a scheduler patch that
avoids spreading STREAM tasks wide prematurely. However, once the task
was properly placed, it delayed migrating the memory due to rate limiting.
Increasing the limit fixed the problem for him.
Currently, the limit is hard-coded and does not account for the real
capabilities of the hardware. Even if an estimate was attempted, it would
not properly account for the number of memory controllers and it could
not account for the amount of bandwidth used for normal accesses. Rather
than fudging, this patch simply eliminates the rate limiting.
However, Jirka reports that a STREAM configuration using multiple
processes achieved similar performance to 4.16. In local tests, this patch
improved performance of STREAM relative to the baseline but it is somewhat
machine-dependent. Most workloads show little or not performance difference
implying that there is not a heavily reliance on the throttling mechanism
and it is safe to remove.
STREAM on 2-socket machine
4.19.0-rc5 4.19.0-rc5
numab-v1r1 noratelimit-v1r1
MB/sec copy 43298.52 ( 0.00%) 44673.38 ( 3.18%)
MB/sec scale 30115.06 ( 0.00%) 31293.06 ( 3.91%)
MB/sec add 32825.12 ( 0.00%) 34883.62 ( 6.27%)
MB/sec triad 32549.52 ( 0.00%) 34906.60 ( 7.24%
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reviewed-by: Rik van Riel <riel@surriel.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Jirka Hladky <jhladky@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linux-MM <linux-mm@kvack.org>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20181001100525.29789-2-mgorman@techsingularity.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Modify ext4_ext_remove_space() and the code it calls to correct the
reserved cluster count for pending reservations (delayed allocated
clusters shared with allocated blocks) when a block range is removed
from the extent tree. Pending reservations may be found for the clusters
at the ends of written or unwritten extents when a block range is removed.
If a physical cluster at the end of an extent is freed, it's necessary
to increment the reserved cluster count to maintain correct accounting
if the corresponding logical cluster is shared with at least one
delayed and unwritten extent as found in the extents status tree.
Add a new function, ext4_rereserve_cluster(), to reapply a reservation
on a delayed allocated cluster sharing blocks with a freed allocated
cluster. To avoid ENOSPC on reservation, a flag is applied to
ext4_free_blocks() to briefly defer updating the freeclusters counter
when an allocated cluster is freed. This prevents another thread
from allocating the freed block before the reservation can be reapplied.
Redefine the partial cluster object as a struct to carry more state
information and to clarify the code using it.
Adjust the conditional code structure in ext4_ext_remove_space to
reduce the indentation level in the main body of the code to improve
readability.
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The code in ext4_da_map_blocks sometimes reserves space for more
delayed allocated clusters than it should, resulting in premature
ENOSPC, exceeded quota, and inaccurate free space reporting.
Fix this by checking for written and unwritten blocks shared in the
same cluster with the newly delayed allocated block. A cluster
reservation should not be made for a cluster for which physical space
has already been allocated.
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Ext4 contains a few functions that are used to search for delayed
extents or blocks in the extents status tree. Rather than duplicate
code to add new functions to search for extents with different status
values, such as written or a combination of delayed and unwritten,
generalize the existing code to search for caller-specified extents
status values. Also, move this code into extents_status.c where it
is better associated with the data structures it operates upon, and
where it can be more readily used to implement new extents status tree
functions that might want a broader scope for i_es_lock.
Three missing static specifiers in RFC version of patch reported and
fixed by Fengguang Wu <fengguang.wu@intel.com>.
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Since we will want to introduce similar TCP state variables for the
transmission of requests, let's rename the existing ones to label
that they are for the receive side.
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Fix error distribution by immediately delivering the errors to all the
affected calls rather than deferring them to a worker thread. The problem
with the latter is that retries and things can happen in the meantime when we
want to stop that sooner.
To this end:
(1) Stop the error distributor from removing calls from the error_targets
list so that peer->lock isn't needed to synchronise against other adds
and removals.
(2) Require the peer's error_targets list to be accessed with RCU, thereby
avoiding the need to take peer->lock over distribution.
(3) Don't attempt to affect a call's state if it is already marked complete.
Signed-off-by: David Howells <dhowells@redhat.com>
When debugging Kyber, it's really useful to know what latencies we've
been having, how the domain depths have been adjusted, and if we've
actually been throttling. Add three tracepoints, kyber_latency,
kyber_adjust, and kyber_throttled, to record that.
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
After sk_state exposed, we can get in which state this retransmission
occurs. That could give us more detail for dignostic.
For example, if this retransmission occurs in SYN_SENT state, it may
also indicates that the syn packet may be dropped on the remote peer due
to syn backlog queue full and then we could check the remote peer.
BTW,SYNACK retransmission is traced in tcp_retransmit_synack tracepoint.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are no more users of SEND_SIG_FORCED so it may be safely removed.
Remove the definition of SEND_SIG_FORCED, it's use in is_si_special,
it's use in TP_STORE_SIGINFO, and it's use in __send_signal as without
any users the uses of SEND_SIG_FORCED are now unncessary.
This makes the code simpler, easier to understand and use. Users of
signal sending functions now no longer need to ask themselves do I
need to use SEND_SIG_FORCED.
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
include/trace/events/sched.h includes <linux/sched.h> (via
<linux/sched/numa_balancing.h>) and so knows about the TASK_* constants
used to interpret .prev_state. So instead of duplicating the magic
numbers make use of the defined macros to ease understanding the
mapping from state bits to letters which isn't completely intuitive for
an outsider.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel@pengutronix.de
Link: http://lkml.kernel.org/r/20180905093636.24068-1-u.kleine-koenig@pengutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The ->rcu_qs_ctr counter was intended to allow providing a lightweight
report of a quiescent state to all RCU flavors. But now that there is
only one flavor of RCU in any one running kernel, there is no point in
having this feature. This commit therefore removes the ->rcu_qs_ctr
field from the rcu_dynticks structure and the ->rcu_qs_ctr_snap field
from the rcu_data structure. This results in the "rqc" option to the
rcu_fqs trace event no longer being used, so this commit also removes the
"rqc" description from the header comment.
While in the neighborhood, this commit also causes the forward-progress
request .rcu_need_heavy_qs be set one jiffies_till_sched_qs interval
later in the grace period than the first setting of .rcu_urgent_qs.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Because rcu_barrier() is a one-line wrapper function for _rcu_barrier()
and because nothing else calls _rcu_barrier(), this commit inlines
_rcu_barrier() into rcu_barrier().
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Pull cgroup updates from Tejun Heo:
"Just one commit from Steven to take out spin lock from trace event
handlers"
* 'for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
cgroup/tracing: Move taking of spin lock out of trace event handlers
- Restructure of lockdep and latency tracers
This is the biggest change. Joel Fernandes restructured the hooks
from irqs and preemption disabling and enabling. He got rid of
a lot of the preprocessor #ifdef mess that they caused.
He turned both lockdep and the latency tracers to use trace events
inserted in the preempt/irqs disabling paths. But unfortunately,
these started to cause issues in corner cases. Thus, parts of the
code was reverted back to where lockde and the latency tracers
just get called directly (without using the trace events).
But because the original change cleaned up the code very nicely
we kept that, as well as the trace events for preempt and irqs
disabling, but they are limited to not being called in NMIs.
- Have trace events use SRCU for "rcu idle" calls. This was required
for the preempt/irqs off trace events. But it also had to not
allow them to be called in NMI context. Waiting till Paul makes
an NMI safe SRCU API.
- New notrace SRCU API to allow trace events to use SRCU.
- Addition of mcount-nop option support
- SPDX headers replacing GPL templates.
- Various other fixes and clean ups.
- Some fixes are marked for stable, but were not fully tested
before the merge window opened.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCW3ruhRQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qiM7AP47NhYdSnCFCRUJfrt6PovXmQtuCHt3
c3QMoGGdvzh9YAEAqcSXwh7uLhpHUp1LjMAPkXdZVwNddf4zJQ1zyxQ+EAU=
=vgEr
-----END PGP SIGNATURE-----
Merge tag 'trace-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
- Restructure of lockdep and latency tracers
This is the biggest change. Joel Fernandes restructured the hooks
from irqs and preemption disabling and enabling. He got rid of a lot
of the preprocessor #ifdef mess that they caused.
He turned both lockdep and the latency tracers to use trace events
inserted in the preempt/irqs disabling paths. But unfortunately,
these started to cause issues in corner cases. Thus, parts of the
code was reverted back to where lockdep and the latency tracers just
get called directly (without using the trace events). But because the
original change cleaned up the code very nicely we kept that, as well
as the trace events for preempt and irqs disabling, but they are
limited to not being called in NMIs.
- Have trace events use SRCU for "rcu idle" calls. This was required
for the preempt/irqs off trace events. But it also had to not allow
them to be called in NMI context. Waiting till Paul makes an NMI safe
SRCU API.
- New notrace SRCU API to allow trace events to use SRCU.
- Addition of mcount-nop option support
- SPDX headers replacing GPL templates.
- Various other fixes and clean ups.
- Some fixes are marked for stable, but were not fully tested before
the merge window opened.
* tag 'trace-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (44 commits)
tracing: Fix SPDX format headers to use C++ style comments
tracing: Add SPDX License format tags to tracing files
tracing: Add SPDX License format to bpf_trace.c
blktrace: Add SPDX License format header
s390/ftrace: Add -mfentry and -mnop-mcount support
tracing: Add -mcount-nop option support
tracing: Avoid calling cc-option -mrecord-mcount for every Makefile
tracing: Handle CC_FLAGS_FTRACE more accurately
Uprobe: Additional argument arch_uprobe to uprobe_write_opcode()
Uprobes: Simplify uprobe_register() body
tracepoints: Free early tracepoints after RCU is initialized
uprobes: Use synchronize_rcu() not synchronize_sched()
tracing: Fix synchronizing to event changes with tracepoint_synchronize_unregister()
ftrace: Remove unused pointer ftrace_swapper_pid
tracing: More reverting of "tracing: Centralize preemptirq tracepoints and unify their usage"
tracing/irqsoff: Handle preempt_count for different configs
tracing: Partial revert of "tracing: Centralize preemptirq tracepoints and unify their usage"
tracing: irqsoff: Account for additional preempt_disable
trace: Use rcu_dereference_raw for hooks from trace-event subsystem
tracing/kprobes: Fix within_notrace_func() to check only notrace functions
...
Pull networking fixes from David Miller:
1) Fix races in IPVS, from Tan Hu.
2) Missing unbind in matchall classifier, from Hangbin Liu.
3) Missing act_ife action release, from Vlad Buslov.
4) Cure lockdep splats in ila, from Cong Wang.
5) veth queue leak on link delete, from Toshiaki Makita.
6) Disable isdn's IIOCDBGVAR ioctl, it exposes kernel addresses. From
Kees Cook.
7) RCU usage fixup in XDP, from Tariq Toukan.
8) Two TCP ULP fixes from Daniel Borkmann.
9) r8169 needs REALTEK_PHY as a Kconfig dependency, from Heiner
Kallweit.
10) Always take tcf_lock with BH disabled, otherwise we can deadlock
with rate estimator code paths. From Vlad Buslov.
11) Don't use MSI-X on RTL8106e r8169 chips, they don't resume properly.
From Jian-Hong Pan.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
ip6_vti: fix creating fallback tunnel device for vti6
ip_vti: fix a null pointer deferrence when create vti fallback tunnel
r8169: don't use MSI-X on RTL8106e
net: lan743x_ptp: convert to ktime_get_clocktai_ts64
net: sched: always disable bh when taking tcf_lock
ip6_vti: simplify stats handling in vti6_xmit
bpf: fix redirect to map under tail calls
r8169: add missing Kconfig dependency
tools/bpf: fix bpf selftest test_cgroup_storage failure
bpf, sockmap: fix sock_map_ctx_update_elem race with exist/noexist
bpf, sockmap: fix map elem deletion race with smap_stop_sock
bpf, sockmap: fix leakage of smap_psock_map_entry
tcp, ulp: fix leftover icsk_ulp_ops preventing sock from reattach
tcp, ulp: add alias for all ulp modules
bpf: fix a rcu usage warning in bpf_prog_array_copy_core()
samples/bpf: all XDP samples should unload xdp/bpf prog on SIGTERM
net/xdp: Fix suspicious RCU usage warning
net/mlx5e: Delete unneeded function argument
Documentation: networking: ti-cpsw: correct cbs parameters for Eth1 100Mb
isdn: Disable IIOCDBGVAR
...
Here is the bit set of char/misc drivers for 4.19-rc1
There is a lot here, much more than normal, seems like everyone is
writing new driver subsystems these days... Anyway, major things here
are:
- new FSI driver subsystem, yet-another-powerpc low-level
hardware bus
- gnss, finally an in-kernel GPS subsystem to try to tame all of
the crazy out-of-tree drivers that have been floating around
for years, combined with some really hacky userspace
implementations. This is only for GNSS receivers, but you
have to start somewhere, and this is great to see.
Other than that, there are new slimbus drivers, new coresight drivers,
new fpga drivers, and loads of DT bindings for all of these and existing
drivers.
Full details of everything is in the shortlog.
All of these have been in linux-next for a while with no reported
issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCW3g7ew8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ykfBgCeOG0RkSI92XVZe0hs/QYFW9kk8JYAnRBf3Qpm
cvW7a+McOoKz/MGmEKsi
=TNfn
-----END PGP SIGNATURE-----
Merge tag 'char-misc-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver updates from Greg KH:
"Here is the bit set of char/misc drivers for 4.19-rc1
There is a lot here, much more than normal, seems like everyone is
writing new driver subsystems these days... Anyway, major things here
are:
- new FSI driver subsystem, yet-another-powerpc low-level hardware
bus
- gnss, finally an in-kernel GPS subsystem to try to tame all of the
crazy out-of-tree drivers that have been floating around for years,
combined with some really hacky userspace implementations. This is
only for GNSS receivers, but you have to start somewhere, and this
is great to see.
Other than that, there are new slimbus drivers, new coresight drivers,
new fpga drivers, and loads of DT bindings for all of these and
existing drivers.
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (255 commits)
android: binder: Rate-limit debug and userspace triggered err msgs
fsi: sbefifo: Bump max command length
fsi: scom: Fix NULL dereference
misc: mic: SCIF Fix scif_get_new_port() error handling
misc: cxl: changed asterisk position
genwqe: card_base: Use true and false for boolean values
misc: eeprom: assignment outside the if statement
uio: potential double frees if __uio_register_device() fails
eeprom: idt_89hpesx: clean up an error pointer vs NULL inconsistency
misc: ti-st: Fix memory leak in the error path of probe()
android: binder: Show extra_buffers_size in trace
firmware: vpd: Fix section enabled flag on vpd_section_destroy
platform: goldfish: Retire pdev_bus
goldfish: Use dedicated macros instead of manual bit shifting
goldfish: Add missing includes to goldfish.h
mux: adgs1408: new driver for Analog Devices ADGS1408/1409 mux
dt-bindings: mux: add adi,adgs1408
Drivers: hv: vmbus: Cleanup synic memory free path
Drivers: hv: vmbus: Remove use of slow_virt_to_phys()
Drivers: hv: vmbus: Reset the channel callback in vmbus_onoffer_rescind()
...
Commits 109980b894 ("bpf: don't select potentially stale ri->map
from buggy xdp progs") and 7c30013133 ("bpf: fix ri->map_owner
pointer on bpf_prog_realloc") tried to mitigate that buggy programs
using bpf_redirect_map() helper call do not leave stale maps behind.
Idea was to add a map_owner cookie into the per CPU struct redirect_info
which was set to prog->aux by the prog making the helper call as a
proof that the map is not stale since the prog is implicitly holding
a reference to it. This owner cookie could later on get compared with
the program calling into BPF whether they match and therefore the
redirect could proceed with processing the map safely.
In (obvious) hindsight, this approach breaks down when tail calls are
involved since the original caller's prog->aux pointer does not have
to match the one from one of the progs out of the tail call chain,
and therefore the xdp buffer will be dropped instead of redirected.
A way around that would be to fix the issue differently (which also
allows to remove related work in fast path at the same time): once
the life-time of a redirect map has come to its end we use it's map
free callback where we need to wait on synchronize_rcu() for current
outstanding xdp buffers and remove such a map pointer from the
redirect info if found to be present. At that time no program is
using this map anymore so we simply invalidate the map pointers to
NULL iff they previously pointed to that instance while making sure
that the redirect path only reads out the map once.
Fixes: 97f91a7cf0 ("bpf: add bpf_redirect_map helper routine")
Fixes: 109980b894 ("bpf: don't select potentially stale ri->map from buggy xdp progs")
Reported-by: Sebastiano Miano <sebastiano.miano@polito.it>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Pull networking updates from David Miller:
"Highlights:
- Gustavo A. R. Silva keeps working on the implicit switch fallthru
changes.
- Support 802.11ax High-Efficiency wireless in cfg80211 et al, From
Luca Coelho.
- Re-enable ASPM in r8169, from Kai-Heng Feng.
- Add virtual XFRM interfaces, which avoids all of the limitations of
existing IPSEC tunnels. From Steffen Klassert.
- Convert GRO over to use a hash table, so that when we have many
flows active we don't traverse a long list during accumluation.
- Many new self tests for routing, TC, tunnels, etc. Too many
contributors to mention them all, but I'm really happy to keep
seeing this stuff.
- Hardware timestamping support for dpaa_eth/fsl-fman from Yangbo Lu.
- Lots of cleanups and fixes in L2TP code from Guillaume Nault.
- Add IPSEC offload support to netdevsim, from Shannon Nelson.
- Add support for slotting with non-uniform distribution to netem
packet scheduler, from Yousuk Seung.
- Add UDP GSO support to mlx5e, from Boris Pismenny.
- Support offloading of Team LAG in NFP, from John Hurley.
- Allow to configure TX queue selection based upon RX queue, from
Amritha Nambiar.
- Support ethtool ring size configuration in aquantia, from Anton
Mikaev.
- Support DSCP and flowlabel per-transport in SCTP, from Xin Long.
- Support list based batching and stack traversal of SKBs, this is
very exciting work. From Edward Cree.
- Busyloop optimizations in vhost_net, from Toshiaki Makita.
- Introduce the ETF qdisc, which allows time based transmissions. IGB
can offload this in hardware. From Vinicius Costa Gomes.
- Add parameter support to devlink, from Moshe Shemesh.
- Several multiplication and division optimizations for BPF JIT in
nfp driver, from Jiong Wang.
- Lots of prepatory work to make more of the packet scheduler layer
lockless, when possible, from Vlad Buslov.
- Add ACK filter and NAT awareness to sch_cake packet scheduler, from
Toke Høiland-Jørgensen.
- Support regions and region snapshots in devlink, from Alex Vesker.
- Allow to attach XDP programs to both HW and SW at the same time on
a given device, with initial support in nfp. From Jakub Kicinski.
- Add TLS RX offload and support in mlx5, from Ilya Lesokhin.
- Use PHYLIB in r8169 driver, from Heiner Kallweit.
- All sorts of changes to support Spectrum 2 in mlxsw driver, from
Ido Schimmel.
- PTP support in mv88e6xxx DSA driver, from Andrew Lunn.
- Make TCP_USER_TIMEOUT socket option more accurate, from Jon
Maxwell.
- Support for templates in packet scheduler classifier, from Jiri
Pirko.
- IPV6 support in RDS, from Ka-Cheong Poon.
- Native tproxy support in nf_tables, from Máté Eckl.
- Maintain IP fragment queue in an rbtree, but optimize properly for
in-order frags. From Peter Oskolkov.
- Improvde handling of ACKs on hole repairs, from Yuchung Cheng"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1996 commits)
bpf: test: fix spelling mistake "REUSEEPORT" -> "REUSEPORT"
hv/netvsc: Fix NULL dereference at single queue mode fallback
net: filter: mark expected switch fall-through
xen-netfront: fix warn message as irq device name has '/'
cxgb4: Add new T5 PCI device ids 0x50af and 0x50b0
net: dsa: mv88e6xxx: missing unlock on error path
rds: fix building with IPV6=m
inet/connection_sock: prefer _THIS_IP_ to current_text_addr
net: dsa: mv88e6xxx: bitwise vs logical bug
net: sock_diag: Fix spectre v1 gadget in __sock_diag_cmd()
ieee802154: hwsim: using right kind of iteration
net: hns3: Add vlan filter setting by ethtool command -K
net: hns3: Set tx ring' tc info when netdev is up
net: hns3: Remove tx ring BD len register in hns3_enet
net: hns3: Fix desc num set to default when setting channel
net: hns3: Fix for phy link issue when using marvell phy driver
net: hns3: Fix for information of phydev lost problem when down/up
net: hns3: Fix for command format parsing error in hclge_is_all_function_id_zero
net: hns3: Add support for serdes loopback selftest
bnxt_en: take coredump_record structure off stack
...
It's been busy summer weeks and hence lots of changes, partly for a
few new drivers and partly for a wide range of fixes.
Here are highlights:
ALSA Core:
- Fix rawmidi buffer management, code cleanup / refactoring
- Fix the SG-buffer page handling with incorrect fallback size
- Fix the stall at virmidi trigger callback with a large buffer;
also offloading and code-refactoring along with it
- Various ALSA sequencer code cleanups
ASoC:
- Deploy the standard snd_pcm_stop_xrun() helper in several drivers
- Support for providing name prefixes to generic component nodes
- Quite a few fixes for DPCM as it gains a bit wider use and more
robust testing
- Generalization of the DIO2125 support to a simple amplifier driver
- Accessory detection support for the audio graph card
- DT support for PXA AC'97 devices
- Quirks for a number of new x86 systems
- Support for AM Logic Meson, Everest ES7154, Intel systems with
RT5682, Qualcomm QDSP6 and WCD9335, Realtek RT5682 and TI TAS5707
HD-audio:
- Code refactoring in HD-audio ext codec codes to drop own classes;
preliminary works for the upcoming legacy codec support
- Generalized DRM audio component for the upcoming radeon / amdgpu
support
- Unification of mic mute-LED and GPIO support for various codecs
- Further improvement of CA0132 codec support including Recon3D
- Proper vga_switcheroo handling for AMD i-GPU
- Update of model list in documentation
- Fixups for another HP Spectre x360, Conexant codecs, power-save
blacklist update
USB-audio:
- Fix the invalid sample rate setup with external clock
- Support of UAC3 selector units and processing units
- Basic UAC3 power-domain support
- Support for Encore mDSD and Thesycon-based DSD devices
- Preparation for future complete callback changes
Firewire:
- Add support for MOTU Traveler
Misc:
- The endianess notation fixes in various drivers
- Add fall-through comment in lots of drivers
- Various sparse warning fixes, e.g. about PCM format types
-----BEGIN PGP SIGNATURE-----
iQJCBAABCAAsFiEEIXTw5fNLNI7mMiVaLtJE4w1nLE8FAltxhXUOHHRpd2FpQHN1
c2UuZGUACgkQLtJE4w1nLE9BPxAAnJyKM2IcfVKanjFZQmM4w1gEM3nt+EBvbOF5
155Mq5ELNLxwwCav0eyoeTVD+mB4UO1y9+64SK73dUzBvj7llILs8s5VNMg7KYn6
MnYMo0UIDlvQ/ZwJzzpU04QZVOIHa5HK9XG+u+Fycr8YOCdhGD6m/zGiBStd/xGd
Bvgw2eHF10l+zqN+Vf+1P2/ENRyNxLYpJYYC02zl0nfXP29ZY+CjicDoRvD8/97X
Pe5Jcfj/4oJZlN0MMXfLLP0vaWbUyogG3a4mzVRC+wHG2FZ5hGfFb92mfT8Yce3H
dq9eaih6zMMiDuP4ipClMv/0t64cA8HD+Du/Enq9Jc/g41QAU1JFzj4qi1Ga7S2w
F80uWCedwZ5FLXYAAK2kIunIaaD5phD+8DegzchiVzL6eg7BLmi/Rsfg9VkuC1Ir
ZZERtT07XCzwXke0IAQQ1yhTE/uMxqntCJlZ0FoohvIABgH0jvrpp/cdDYFdC3it
TaNk9REAuCVjr2ket1Jrw6pKNgtry7cDkKFK5d8S5HTNFEL+sWwz2NcSdPNRIIFT
aqeUWt/H1P5G25if/636UJtf+EtlRPqC2Eng8OpY8hitb2vB3trjY25T4a5x5FW5
S4oTyVWmxa4Xxj4QMMhpo7jxxwsux8J1fghDCpEqekiwt72CyVP2UN/cU8HJL7kN
IWqnZgg=
=eTIC
-----END PGP SIGNATURE-----
Merge tag 'sound-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound updates from Takashi Iwai:
"It's been busy summer weeks and hence lots of changes, partly for a
few new drivers and partly for a wide range of fixes.
Here are highlights:
ALSA Core:
- Fix rawmidi buffer management, code cleanup / refactoring
- Fix the SG-buffer page handling with incorrect fallback size
- Fix the stall at virmidi trigger callback with a large buffer; also
offloading and code-refactoring along with it
- Various ALSA sequencer code cleanups
ASoC:
- Deploy the standard snd_pcm_stop_xrun() helper in several drivers
- Support for providing name prefixes to generic component nodes
- Quite a few fixes for DPCM as it gains a bit wider use and more
robust testing
- Generalization of the DIO2125 support to a simple amplifier driver
- Accessory detection support for the audio graph card
- DT support for PXA AC'97 devices
- Quirks for a number of new x86 systems
- Support for AM Logic Meson, Everest ES7154, Intel systems with
RT5682, Qualcomm QDSP6 and WCD9335, Realtek RT5682 and TI TAS5707
HD-audio:
- Code refactoring in HD-audio ext codec codes to drop own classes;
preliminary works for the upcoming legacy codec support
- Generalized DRM audio component for the upcoming radeon / amdgpu
support
- Unification of mic mute-LED and GPIO support for various codecs
- Further improvement of CA0132 codec support including Recon3D
- Proper vga_switcheroo handling for AMD i-GPU
- Update of model list in documentation
- Fixups for another HP Spectre x360, Conexant codecs, power-save
blacklist update
USB-audio:
- Fix the invalid sample rate setup with external clock
- Support of UAC3 selector units and processing units
- Basic UAC3 power-domain support
- Support for Encore mDSD and Thesycon-based DSD devices
- Preparation for future complete callback changes
Firewire:
- Add support for MOTU Traveler
Misc:
- The endianess notation fixes in various drivers
- Add fall-through comment in lots of drivers
- Various sparse warning fixes, e.g. about PCM format types"
* tag 'sound-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (529 commits)
ASoC: adav80x: mark expected switch fall-through
ASoC: da7219: Add delays to capture path to remove DC offset noise
ALSA: usb-audio: Mark expected switch fall-through
ALSA: mixart: Mark expected switch fall-through
ALSA: opl3: Mark expected switch fall-through
ALSA: hda/ca0132 - Add exit commands for Recon3D
ALSA: hda/ca0132 - Change mixer controls for Recon3D
ALSA: hda/ca0132 - Add Recon3D input and output select commands
ALSA: hda/ca0132 - Add DSP setup defaults for Recon3D
ALSA: hda/ca0132 - Add Recon3D startup functions and setup
ALSA: hda/ca0132 - Add bool variable to enable/disable pci region2 mmio
ALSA: hda/ca0132 - Add Recon3D pincfg
ALSA: hda/ca0132 - Add quirk ID and enum for Recon3D
ALSA: hda/ca0132 - Add alt_functions unsolicited response
ALSA: hda/ca0132 - Clean up ca0132_init function.
ALSA: hda/ca0132 - Create mmio gpio function to make code clearer
ASoC: wm_adsp: Make DSP name configurable by codec driver
ASoC: wm_adsp: Declare firmware controls from codec driver
ASoC: max98373: Added software reset register to readable registers
ASoC: wm_adsp: Correct DSP pointer for preloader control
...
- Add a new framework for CPU idle time injection (Daniel Lezcano).
- Add AVS support to the armada-37xx cpufreq driver (Gregory CLEMENT).
- Add support for current CPU frequency reporting to the ACPI CPPC
cpufreq driver (George Cherian).
- Rework the cooling device registration in the imx6q/thermal
driver (Bastian Stender).
- Make the pcc-cpufreq driver refuse to work with dynamic
scaling governors on systems with many CPUs to avoid
scalability issues with it (Rafael Wysocki).
- Fix the intel_pstate driver to report different maximum CPU
frequencies on systems where they really are different and to
ignore the turbo active ratio if hardware-managend P-states (HWP)
are in use; make it use the match_string() helper (Xie Yisheng,
Srinivas Pandruvada).
- Fix a minor deferred probe issue in the qcom-kryo cpufreq
driver (Niklas Cassel).
- Add a tracepoint for the tracking of frequency limits changes
(from Andriod) to the cpufreq core (Ruchi Kandoi).
- Fix a circular lock dependency between CPU hotplug and sysfs
locking in the cpufreq core reported by lockdep (Waiman Long).
- Avoid excessive error reports on driver registration failures
in the ARM cpuidle driver (Sudeep Holla).
- Add a new device links flag to the driver core to make links go
away automatically on supplier driver removal (Vivek Gautam).
- Eliminate potential race condition between system-wide power
management transitions and system shutdown (Pingfan Liu).
- Add a quirk to save NVS memory on system suspend for the ASUS
1025C laptop (Willy Tarreau).
- Make more systems use suspend-to-idle (instead of ACPI S3) by
default (Tristian Celestin).
- Get rid of stack VLA usage in the low-level hibernation code on
64-bit x86 (Kees Cook).
- Fix error handling in the hibernation core and mark an expected
fall-through switch in it (Chengguang Xu, Gustavo Silva).
- Extend the generic power domains (genpd) framework to support
attaching a device to a power domain by name (Ulf Hansson).
- Fix device reference counting and user limits initialization in
the devfreq core (Arvind Yadav, Matthias Kaehlcke).
- Fix a few issues in the rk3399_dmc devfreq driver and improve its
documentation (Enric Balletbo i Serra, Lin Huang, Nick Milner).
- Drop a redundant error message from the exynos-ppmu devfreq driver
(Markus Elfring).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJbcqOqAAoJEILEb/54YlRxOxMP/2ZFvnXU0pey/VX/+TelLMS7
/ROVGQ+s75QP1c9P/3BjvnXc0dsMRLRFPog+7wyoG/2DbEIV25COyAYsmSE0TRni
XUaZO6YAx4/e3pm2AfamYbLCPvjw85eucHg5QJQ4b1mSVRNJOsNv+fUo6lmxwvnm
j9kHvfttFeIhoa/3wa7hbhPKLln46atnpVSxCIceY7L5EFNhkKBvQt6B5yx9geb9
QMY6ohgkyN+bnK9QySXX+trcWpzx1uGX0apI07NkX7n9QGFdU4lCW8lsAf8jMC3g
PPValTsUQsdRONUJJsrgqBioq4tvtgQWibyS2tfRrOGXYvHpJNpGmHVplfsrf/SE
cvlsciR47YbmrXZuqg/r8hql+qefNN16/rnZIZ9VnbcG806VBy2z8IzI5wcdWR7p
vzxhbCqVqOHcEdEwRwvuM2io67MWvkGtKsbCP+33DBh8SubpsECpKN4nIDboa3SE
CJ15RUqXnF6enmmfCKOoHZeu7iXWDz6Pi71XmRzaj9DqbITVV281IerqLgV3rbal
BVa53+202iD0IP+2b7KedGe/5ALlI97ffN0gB+L/eB832853DKSZQKzcvvpRhEN7
Iv2crnUwuQED9ns8P7hzp1Bk9CFCAOLW8UM43YwZRPWnmdeSsPJusJ5lzkAf7bss
wfsFoUE3RaY4msnuHyCh
=kv2M
-----END PGP SIGNATURE-----
Merge tag 'pm-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:
"These add a new framework for CPU idle time injection, to be used by
all of the idle injection code in the kernel in the future, fix some
issues and add a number of relatively small extensions in multiple
places.
Specifics:
- Add a new framework for CPU idle time injection (Daniel Lezcano).
- Add AVS support to the armada-37xx cpufreq driver (Gregory
CLEMENT).
- Add support for current CPU frequency reporting to the ACPI CPPC
cpufreq driver (George Cherian).
- Rework the cooling device registration in the imx6q/thermal driver
(Bastian Stender).
- Make the pcc-cpufreq driver refuse to work with dynamic scaling
governors on systems with many CPUs to avoid scalability issues
with it (Rafael Wysocki).
- Fix the intel_pstate driver to report different maximum CPU
frequencies on systems where they really are different and to
ignore the turbo active ratio if hardware-managend P-states (HWP)
are in use; make it use the match_string() helper (Xie Yisheng,
Srinivas Pandruvada).
- Fix a minor deferred probe issue in the qcom-kryo cpufreq driver
(Niklas Cassel).
- Add a tracepoint for the tracking of frequency limits changes (from
Andriod) to the cpufreq core (Ruchi Kandoi).
- Fix a circular lock dependency between CPU hotplug and sysfs
locking in the cpufreq core reported by lockdep (Waiman Long).
- Avoid excessive error reports on driver registration failures in
the ARM cpuidle driver (Sudeep Holla).
- Add a new device links flag to the driver core to make links go
away automatically on supplier driver removal (Vivek Gautam).
- Eliminate potential race condition between system-wide power
management transitions and system shutdown (Pingfan Liu).
- Add a quirk to save NVS memory on system suspend for the ASUS 1025C
laptop (Willy Tarreau).
- Make more systems use suspend-to-idle (instead of ACPI S3) by
default (Tristian Celestin).
- Get rid of stack VLA usage in the low-level hibernation code on
64-bit x86 (Kees Cook).
- Fix error handling in the hibernation core and mark an expected
fall-through switch in it (Chengguang Xu, Gustavo Silva).
- Extend the generic power domains (genpd) framework to support
attaching a device to a power domain by name (Ulf Hansson).
- Fix device reference counting and user limits initialization in the
devfreq core (Arvind Yadav, Matthias Kaehlcke).
- Fix a few issues in the rk3399_dmc devfreq driver and improve its
documentation (Enric Balletbo i Serra, Lin Huang, Nick Milner).
- Drop a redundant error message from the exynos-ppmu devfreq driver
(Markus Elfring)"
* tag 'pm-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (35 commits)
PM / reboot: Eliminate race between reboot and suspend
PM / hibernate: Mark expected switch fall-through
cpufreq: intel_pstate: Ignore turbo active ratio in HWP
cpufreq: Fix a circular lock dependency problem
cpu/hotplug: Add a cpus_read_trylock() function
x86/power/hibernate_64: Remove VLA usage
cpufreq: trace frequency limits change
cpufreq: intel_pstate: Show different max frequency with turbo 3 and HWP
cpufreq: pcc-cpufreq: Disable dynamic scaling on many-CPU systems
cpufreq: qcom-kryo: Silently error out on EPROBE_DEFER
cpufreq / CPPC: Add cpuinfo_cur_freq support for CPPC
cpufreq: armada-37xx: Add AVS support
dt-bindings: marvell: Add documentation for the Armada 3700 AVS binding
PM / devfreq: rk3399_dmc: Fix duplicated opp table on reload.
PM / devfreq: Init user limits from OPP limits, not viceversa
PM / devfreq: rk3399_dmc: fix spelling mistakes.
PM / devfreq: rk3399_dmc: do not print error when get supply and clk defer.
dt-bindings: devfreq: rk3399_dmc: move interrupts to be optional.
PM / devfreq: rk3399_dmc: remove wait for dcf irq event.
dt-bindings: clock: add rk3399 DDR3 standard speed bins.
...
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAltxe7QACgkQxWXV+ddt
WDswMA//QlRO+Ln5CH+RlT4fyf1RQUQZblWss2zxrmlo1GRI3Ljf2DNsBE3rD7P4
NSiXfHmgkdjcQP6poPLJwHxwkNd4NFXglYg64wWO10RjHGhKglmH6ztU88wsPfr2
2RZv271/NvYIEkEi6kdyy8ilKeWMshOfyj3+PaeapQn67uJfimyiUvDgUgbvwH3c
yj0nVRLP1C7snNj4Atti/rjXMhG+m1UWfjRkZsmqlBp52k2UAcrtiwQK+DS5b9mL
aWLSaGmIcJtSMkNJPQBST9GTWbJfKTpceoCzkT0o3irvQpN2e2flAJ4ireL8q4mN
MvqJ7giPBFHNDcHEzN6VERvsaA1Rx9Vq20ieQl8JAMd4p/bi5ehN3ww+9vau5zCw
Pc8WeKEILKrLYEAgHOnUO1wxHw994Iv5CA26roTQ0HNXQJjyEZ4m40Ch6LzmfKPm
WKcHX14Uw22GKaFEXHTOpRZ0U0d1cMTcn5zaAajGsB9LwcaiLM+OiFSPtDkwUOB9
QGJHklZVXAD1IH9HFPuq85uUtXTLXbxsw1g8phEJGbmaVxxCOAUAXwEk3qxuZNbz
CHL3G5+l3JEXxfoJSbDW60kr8xic7teqQDszqqP2qlqtP15ty2xc9d5Q8MZajSTZ
H1z9+0gfjYYHrGuAp69MtCbdQhhDSqLyivjJJm0HBaKfVNGW2Xg=
=jBaz
-----END PGP SIGNATURE-----
Merge tag 'for-4.19-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
"Mostly fixes and cleanups, nothing big, though the notable thing is
the inserted/deleted lines delta -1124.
User visible changes:
- allow defrag on opened read-only files that have rw permissions;
similar to what dedupe will allow on such files
Core changes:
- tree checker improvements, reported by fuzzing:
* more checks for: block group items, essential trees
* chunk type validation
* mount time cross-checks that physical and logical chunks match
* switch more error codes to EUCLEAN aka EFSCORRUPTED
Fixes:
- fsync corner case fixes
- fix send failure when root has deleted files still open
- send, fix incorrect file layout after hole punching beyond eof
- fix races between mount and deice scan ioctl, found by fuzzing
- fix deadlock when delayed iput is called from writeback on the same
inode; rare but has been observed in practice, also removes code
- fix pinned byte accounting, using the right percpu helpers; this
should avoid some write IO inefficiency during low space conditions
- don't remove block group that still has pinned bytes
- reset on-disk device stats value after replace, otherwise this
would report stale values for the new device
Cleanups:
- time64_t/timespec64 cleanups
- remove remaining dead code in scrub handling NOCOW extents after
disabling it in previous cycle
- simplify fsync regarding ordered extents logic and remove all the
related code
- remove redundant arguments in order to reduce stack space
consumption
- remove support for V0 type of extents, not in use since 2.6.30
- remove several unused structure members
- fewer indirect function calls by inlining some callbacks
- qgroup rescan timing fixes
- vfs: iget cleanups"
* tag 'for-4.19-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (182 commits)
btrfs: revert fs_devices state on error of btrfs_init_new_device
btrfs: Exit gracefully when chunk map cannot be inserted to the tree
btrfs: Introduce mount time chunk <-> dev extent mapping check
btrfs: Verify that every chunk has corresponding block group at mount time
btrfs: Check that each block group has corresponding chunk at mount time
Btrfs: send, fix incorrect file layout after hole punching beyond eof
btrfs: Use wrapper macro for rcu string to remove duplicate code
btrfs: simplify btrfs_iget
btrfs: lift make_bad_inode into btrfs_iget
btrfs: simplify IS_ERR/PTR_ERR checks
btrfs: btrfs_iget never returns an is_bad_inode inode
btrfs: replace: Reset on-disk dev stats value after replace
btrfs: extent-tree: Remove unused __btrfs_free_block_rsv
btrfs: backref: Use ERR_CAST to return error code
btrfs: Remove redundant btrfs_release_path from btrfs_unlink_subvol
btrfs: Remove root parameter from btrfs_unlink_subvol
btrfs: Remove fs_info from btrfs_add_root_ref
btrfs: Remove fs_info from btrfs_del_root_ref
btrfs: Remove fs_info from btrfs_del_root
btrfs: Remove fs_info from btrfs_delete_delayed_dir_index
...
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJbcDltAAoJEAAOaEEZVoIV9YUP/ioYw3+kIXwRa05ec8Twbn6m
O+1HOX9UAuDWX+97P2kCJSqi9a/TkpfLQxoGxZowDv97ZkVeHdzVaJVN/0sbH1uP
Oj86b2Y5zGyrLk3D4c5VcyHkvogr16DUvhYeRtcjPyufSaKqefftCeetTntRvr4Z
rZ8HLKYxa4AZlP/FkgVZ/S6PmvG7tH8rVLMFCfY+rASbSU4gBxmrS0nB2UgU4ycb
b2SVZo4Zxycw0KiP14T5AgN4puid7VEofFezBpMKjNPjCUk1wJ3mtUyLeS8jRUEx
M6qnl7DU4BWPRtGxiXNebN85M8iaJnlaeQglowklrubrFtqnVeDQM0rz42NiY2/S
2jk6q8b9XKvoDEev2VeKfmFztu4gGXypNR6xAolvgBnUZziKKFUJutZwJe95pZV1
kO6CvdchuCQdLMLdHPji2kk2AA8Go5webONfioFhoWYfeSvgjRELmOFTDjP22eNv
s1/vfCsjuU6vdyaXXXZpN/7VMenoNegSQUXFoxhmHO7BtNuTqk+ZVFNgaWsOrXcx
I0ZIjCAmOsU+DKLn+DTT8kDtU3ihZz/OCXXZJi9JPeLFZ6gSghDRnFThwYqcZvA+
syIn1XGGtn15Q7A5FZNvwGqYir1GtyOslrxxIKxGlhleovhT94HuKXkL1T+k74zl
FXItKgRs0TPQfTsI8q+C
=IMcz
-----END PGP SIGNATURE-----
Merge tag 'locks-v4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux
Pull file locking updates from Jeff Layton:
"Just a couple of patches from Konstantin to fix /proc/locks when the
process that set the lock has exited, and a new tracepoint for the
flock() codepath. Also threw in mailmap entries for my addresses and a
comment cleanup"
* tag 'locks-v4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux:
locks: remove misleading obsolete comment
mailmap: remap some of my email addresses to kernel.org address
locks: add tracepoint in flock codepath
fs/lock: show locks taken by processes from another pidns
fs/lock: skip lock owner pid translation in case we are in init_pid_ns
There is an unalignment access about the structure
'trace_event_raw_fib_table_lookup'.
In include/trace/events/fib.h, there is a memory operation which casting
the 'src' data member to a pointer, and then store a value to this
pointer point to.
p32 = (__be32 *) __entry->src;
*p32 = flp->saddr;
The offset of 'src' in structure trace_event_raw_fib_table_lookup is not
four bytes alignment. On some architectures, they don't permit the
unalignment access, it need to pay the price to handle this situation in
exception handler.
Adjust the layout of structure to avoid this case.
Fixes: 9f323973c9 ("net/ipv4: Udate fib_table_lookup tracepoint")
Signed-off-by: Zong Li <zong@andestech.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We used to call btrfs_file_extent_inline_len() to get the uncompressed
data size of an inlined extent.
However this function is hiding evil, for compressed extent, it has no
choice but to directly read out ram_bytes from btrfs_file_extent_item.
While for uncompressed extent, it uses item size to calculate the real
data size, and ignoring ram_bytes completely.
In fact, for corrupted ram_bytes, due to above behavior kernel
btrfs_print_leaf() can't even print correct ram_bytes to expose the bug.
Since we have the tree-checker to verify all EXTENT_DATA, such mismatch
can be detected pretty easily, thus we can trust ram_bytes without the
evil btrfs_file_extent_inline_len().
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This is no longer used anywhere, remove all of it.
Signed-off-by: Josef Bacik <jbacik@fb.com>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Fix the ACK proposal tracepoint outcomes list by making the one that's an
empty string not an empty string - which gets rendered as a hex number
string instead.
Signed-off-by: David Howells <dhowells@redhat.com>
Trace successful packet transmission (kernel_sendmsg() succeeded, that is)
in AF_RXRPC. We can share the enum that defines the transmission points
with the trace_rxrpc_tx_fail() tracepoint, so rename its constants to be
applicable to both.
Also, save the internal call->debug_id in the rxrpc_channel struct so that
it can be used in retransmission trace lines.
Signed-off-by: David Howells <dhowells@redhat.com>
This patch detaches the preemptirq tracepoints from the tracers and
keeps it separate.
Advantages:
* Lockdep and irqsoff event can now run in parallel since they no longer
have their own calls.
* This unifies the usecase of adding hooks to an irqsoff and irqson
event, and a preemptoff and preempton event.
3 users of the events exist:
- Lockdep
- irqsoff and preemptoff tracers
- irqs and preempt trace events
The unification cleans up several ifdefs and makes the code in preempt
tracer and irqsoff tracers simpler. It gets rid of all the horrific
ifdeferry around PROVE_LOCKING and makes configuration of the different
users of the tracepoints more easy and understandable. It also gets rid
of the time_* function calls from the lockdep hooks used to call into
the preemptirq tracer which is not needed anymore. The negative delta in
lines of code in this patch is quite large too.
In the patch we introduce a new CONFIG option PREEMPTIRQ_TRACEPOINTS
as a single point for registering probes onto the tracepoints. With
this,
the web of config options for preempt/irq toggle tracepoints and its
users becomes:
PREEMPT_TRACER PREEMPTIRQ_EVENTS IRQSOFF_TRACER PROVE_LOCKING
| | \ | |
\ (selects) / \ \ (selects) /
TRACE_PREEMPT_TOGGLE ----> TRACE_IRQFLAGS
\ /
\ (depends on) /
PREEMPTIRQ_TRACEPOINTS
Other than the performance tests mentioned in the previous patch, I also
ran the locking API test suite. I verified that all tests cases are
passing.
I also injected issues by not registering lockdep probes onto the
tracepoints and I see failures to confirm that the probes are indeed
working.
This series + lockdep probes not registered (just to inject errors):
[ 0.000000] hard-irqs-on + irq-safe-A/21: ok | ok | ok |
[ 0.000000] soft-irqs-on + irq-safe-A/21: ok | ok | ok |
[ 0.000000] sirq-safe-A => hirqs-on/12:FAILED|FAILED| ok |
[ 0.000000] sirq-safe-A => hirqs-on/21:FAILED|FAILED| ok |
[ 0.000000] hard-safe-A + irqs-on/12:FAILED|FAILED| ok |
[ 0.000000] soft-safe-A + irqs-on/12:FAILED|FAILED| ok |
[ 0.000000] hard-safe-A + irqs-on/21:FAILED|FAILED| ok |
[ 0.000000] soft-safe-A + irqs-on/21:FAILED|FAILED| ok |
[ 0.000000] hard-safe-A + unsafe-B #1/123: ok | ok | ok |
[ 0.000000] soft-safe-A + unsafe-B #1/123: ok | ok | ok |
With this series + lockdep probes registered, all locking tests pass:
[ 0.000000] hard-irqs-on + irq-safe-A/21: ok | ok | ok |
[ 0.000000] soft-irqs-on + irq-safe-A/21: ok | ok | ok |
[ 0.000000] sirq-safe-A => hirqs-on/12: ok | ok | ok |
[ 0.000000] sirq-safe-A => hirqs-on/21: ok | ok | ok |
[ 0.000000] hard-safe-A + irqs-on/12: ok | ok | ok |
[ 0.000000] soft-safe-A + irqs-on/12: ok | ok | ok |
[ 0.000000] hard-safe-A + irqs-on/21: ok | ok | ok |
[ 0.000000] soft-safe-A + irqs-on/21: ok | ok | ok |
[ 0.000000] hard-safe-A + unsafe-B #1/123: ok | ok | ok |
[ 0.000000] soft-safe-A + unsafe-B #1/123: ok | ok | ok |
Link: http://lkml.kernel.org/r/20180730222423.196630-4-joel@joelfernandes.org
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
systrace used for tracing for Android systems has carried a patch for
many years in the Android tree that traces when the cpufreq limits
change. With the help of this information, systrace can know when the
policy limits change and can visually display the data. Lets add
upstream support for the same.
Signed-off-by: Ruchi Kandoi <kandoiruchi@google.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The Aspeed AST2x00 can contain a ColdFire v1 coprocessor which
is currently unused on OpenPower systems.
This adds an alternative to the fsi-master-gpio driver that
uses that coprocessor instead of bit banging from the ARM
core itself. The end result is about 4 times faster.
The firmware for the coprocessor and its source code can be
found at https://github.com/ozbenh/cf-fsi and is system specific.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Now that quiescent states for newly offlined CPUs are reported either
when that CPU goes offline or at the end of grace-period initialization,
the CPU-hotplug failsafe in the force-quiescent-state code path is no
longer needed.
This commit therefore removes this failsafe.
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>