linux_dsm_epyc7002/include
Shaohua Li 9e234eeafb blk-throttle: add a simple idle detection
A cgroup gets assigned a low limit, but the cgroup could never dispatch
enough IO to cross the low limit. In such case, the queue state machine
will remain in LIMIT_LOW state and all other cgroups will be throttled
according to low limit. This is unfair for other cgroups. We should
treat the cgroup idle and upgrade the state machine to lower state.

We also have a downgrade logic. If the state machine upgrades because of
cgroup idle (real idle), the state machine will downgrade soon as the
cgroup is below its low limit. This isn't what we want. A more
complicated case is cgroup isn't idle when queue is in LIMIT_LOW. But
when queue gets upgraded to lower state, other cgroups could dispatch
more IO and this cgroup can't dispatch enough IO, so the cgroup is below
its low limit and looks like idle (fake idle). In this case, the queue
should downgrade soon. The key to determine if we should do downgrade is
to detect if cgroup is truely idle.

Unfortunately it's very hard to determine if a cgroup is real idle. This
patch uses the 'think time check' idea from CFQ for the purpose. Please
note, the idea doesn't work for all workloads. For example, a workload
with io depth 8 has disk utilization 100%, hence think time is 0, eg,
not idle. But the workload can run higher bandwidth with io depth 16.
Compared to io depth 16, the io depth 8 workload is idle. We use the
idea to roughly determine if a cgroup is idle.

We treat a cgroup idle if its think time is above a threshold (by
default 1ms for SSD and 100ms for HD). The idea is think time above the
threshold will start to harm performance. HD is much slower so a longer
think time is ok.

The patch (and the latter patches) uses 'unsigned long' to track time.
We convert 'ns' to 'us' with 'ns >> 10'. This is fast but loses
precision, should not a big deal.

Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-03-28 08:02:20 -06:00
..
acpi
asm-generic mm: convert generic code to 5-level paging 2017-03-09 11:48:47 -08:00
clocksource
crypto net: Work around lockdep limitation in sockets that use sockets 2017-03-09 18:23:27 -08:00
drm sched/headers: Prepare to remove the <linux/mm_types.h> dependency from <linux/sched.h> 2017-03-02 08:42:37 +01:00
dt-bindings scripts/spelling.txt: add "overide" pattern and fix typo instances 2017-03-09 17:01:09 -08:00
keys KEYS: Differentiate uses of rcu_dereference_key() and user_key_payload() 2017-03-02 10:09:00 +11:00
kvm
linux blk-throttle: add a simple idle detection 2017-03-28 08:02:20 -06:00
math-emu
media media fixes for v4.11-rc2 2017-03-09 15:50:56 -08:00
memory
misc
net Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2017-03-14 21:31:23 -07:00
pcmcia
ras
rdma sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h> 2017-03-02 08:42:31 +01:00
rxrpc
scsi scsi: mpt3sas: Avoid sleeping in interrupt context 2017-03-01 21:52:13 -05:00
soc ARC updates for 4.11 rc1 2017-02-22 10:33:53 -08:00
sound sched/headers: Prepare to remove spurious <linux/sched.h> inclusion dependencies 2017-03-02 08:42:41 +01:00
target target: fix ALUA transition timeout handling 2017-03-18 14:47:28 -07:00
trace There was some breakage with the changes for jump labels in the 4.11 merge 2017-03-07 09:37:28 -08:00
uapi amd, intel, tilcdc, omap, and malidp fixes. 2017-03-17 11:19:52 -07:00
video
xen Merge branch 'stable/for-linus-4.11' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb 2017-03-07 10:23:17 -08:00
Kbuild