Go to file
Eric Whitney 4b3139576a ext4: shrink race window in ext4_should_retry_alloc()
[ Upstream commit efc61345274d6c7a46a0570efbc916fcbe3e927b ]

When generic/371 is run on kvm-xfstests using 5.10 and 5.11 kernels, it
fails at significant rates on the two test scenarios that disable
delayed allocation (ext3conv and data_journal) and force actual block
allocation for the fallocate and pwrite functions in the test.  The
failure rate on 5.10 for both ext3conv and data_journal on one test
system typically runs about 85%.  On 5.11, the failure rate on ext3conv
sometimes drops to as low as 1% while the rate on data_journal
increases to nearly 100%.

The observed failures are largely due to ext4_should_retry_alloc()
cutting off block allocation retries when s_mb_free_pending (used to
indicate that a transaction in progress will free blocks) is 0.
However, free space is usually available when this occurs during runs
of generic/371.  It appears that a thread attempting to allocate
blocks is just missing transaction commits in other threads that
increase the free cluster count and reset s_mb_free_pending while
the allocating thread isn't running.  Explicitly testing for free space
availability avoids this race.

The current code uses a post-increment operator in the conditional
expression that determines whether the retry limit has been exceeded.
This means that the conditional expression uses the value of the
retry counter before it's increased, resulting in an extra retry cycle.
The current code actually retries twice before hitting its retry limit
rather than once.

Increasing the retry limit to 3 from the current actual maximum retry
count of 2 in combination with the change described above reduces the
observed failure rate to less that 0.1% on both ext3conv and
data_journal with what should be limited impact on users sensitive to
the overhead caused by retries.

A per filesystem percpu counter exported via sysfs is added to allow
users or developers to track the number of times the retry limit is
exceeded without resorting to debugging methods.  This should provide
some insight into worst case retry behavior.

Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Link: https://lore.kernel.org/r/20210218151132.19678-1-enwlinux@gmail.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-04-07 15:00:03 +02:00
arch bpf: Fix fexit trampoline. 2021-04-07 15:00:03 +02:00
block block: recalculate segment count for multi-segment discards correctly 2021-03-30 14:32:06 +02:00
certs certs: Fix blacklist flag type confusion 2021-03-04 11:37:59 +01:00
crypto crypto: mips/poly1305 - enable for all MIPS processors 2021-03-17 17:06:10 +01:00
Documentation KVM: x86: Protect userspace MSR filter with SRCU, and set atomically-ish 2021-03-30 14:31:53 +02:00
drivers xen-blkback: don't leak persistent grants from xen_blkbk_map() 2021-03-30 14:32:09 +02:00
fs ext4: shrink race window in ext4_should_retry_alloc() 2021-04-07 15:00:03 +02:00
include bpf: Fix fexit trampoline. 2021-04-07 15:00:03 +02:00
init kgdb: fix to kill breakpoints on initmem after boot 2021-03-04 11:38:46 +01:00
ipc ipc: adjust proc_ipc_sem_dointvec definition to match prototype 2020-09-05 12:14:29 -07:00
kernel bpf: Fix fexit trampoline. 2021-04-07 15:00:03 +02:00
lib kasan: fix memory corruption in kasan_bitops_tags test 2021-03-17 17:06:25 +01:00
LICENSES LICENSES/deprecated: add Zlib license text 2020-09-16 14:33:49 +02:00
mm mm/memcg: fix 5.10 backport of splitting page memcg 2021-03-30 14:32:07 +02:00
net mac80211: fix double free in ibss_leave 2021-03-30 14:32:08 +02:00
samples samples, bpf: Add missing munmap in xdpsock 2021-03-17 17:06:12 +01:00
scripts kbuild: dummy-tools: fix inverted tests for gcc 2021-03-30 14:31:50 +02:00
security integrity: double check iint_cache was initialized 2021-03-30 14:31:55 +02:00
sound ALSA: hda: ignore invalid NHLT table 2021-03-30 14:31:48 +02:00
tools perf synthetic events: Avoid write of uninitialized memory when generating PERF_RECORD_MMAP* records 2021-03-30 14:32:06 +02:00
usr Merge branch 'work.fdpic' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-08-07 13:29:39 -07:00
virt KVM: Use kvm_pfn_t for local PFN variable in hva_to_pfn_remapped() 2021-02-26 10:13:01 +01:00
.clang-format RDMA 5.10 pull request 2020-10-17 11:18:18 -07:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore .gitignore: docs: ignore sphinx_*/ directories 2020-09-10 10:44:31 -06:00
.mailmap mailmap: add two more addresses of Uwe Kleine-König 2020-12-06 10:19:07 -08:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: Move Jason Cooper to CREDITS 2020-11-30 10:20:34 +01:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS MAINTAINERS: move the staging subsystem to lists.linux.dev 2021-03-25 09:04:18 +01:00
Makefile Linux 5.10.27 2021-03-30 14:32:09 +02:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.