linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-11-24 22:30:52 +07:00

Author	SHA1	Message	Date
Corrado Zoccolo	c0324a020e	cfq-iosched: reimplement priorities using different service trees We use different service trees for different priority classes. This allows a simplification in the service tree insertion code, that no longer has to consider priority while walking the tree. Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-10-28 09:23:26 +01:00
Corrado Zoccolo	aa6f6a3de1	cfq-iosched: preparation to handle multiple service trees We embed a pointer to the service tree in each queue, to handle multiple service trees easily. Service trees are enriched with a counter. cfq_add_rq_rb is invoked after putting the rq in the fifo, to ensure that all fields in rq are properly initialized. Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-10-28 09:23:26 +01:00
Corrado Zoccolo	5db5d64277	cfq-iosched: adapt slice to number of processes doing I/O When the number of processes performing I/O concurrently increases, a fixed time slice per process will cause large latencies. This patch, if low_latency mode is enabled, will scale the time slice assigned to each process according to a 300ms target latency. In order to keep fairness among processes: * The number of active processes is computed using a special form of running average, that quickly follows sudden increases (to keep latency low), and decrease slowly (to have fairness in spite of rapid decreases of this value). To safeguard sequential bandwidth, we impose a minimum time slice (computed using 2*cfq_slice_idle as base, adjusted according to priority and async-ness). Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-10-28 09:23:26 +01:00
Shaohua Li	1a1238a7dd	cfq-iosched: improve hw_tag detection If active queue hasn't enough requests and idle window opens, cfq will not dispatch sufficient requests to hardware. In such situation, current code will zero hw_tag. But this is because cfq doesn't dispatch enough requests instead of hardware queue doesn't work. Don't zero hw_tag in such case. Signed-off-by: Shaohua Li <shaohua.li@intel.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-10-27 08:46:23 +01:00
Jeff Moyer	e6c5bc737a	cfq: break apart merged cfqqs if they stop cooperating cfq_queues are merged if they are issuing requests within the mean seek distance of one another. This patch detects when the cooperating stops and breaks the queues back up. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-10-26 14:34:47 +01:00
Jeff Moyer	b3b6d0408c	cfq: change the meaning of the cfqq_coop flag The flag used to indicate that a cfqq was allowed to jump ahead in the scheduling order due to submitting a request close to the queue that just executed. Since closely cooperating queues are now merged, the flag holds little meaning. Change it to indicate that multiple queues were merged. This will later be used to allow the breaking up of merged queues when they are no longer cooperating. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-10-26 14:34:47 +01:00
Jeff Moyer	df5fe3e8e1	cfq: merge cooperating cfq_queues When cooperating cfq_queues are detected currently, they are allowed to skip ahead in the scheduling order. It is much more efficient to automatically share the cfq_queue data structure between cooperating processes. Performance of the read-test2 benchmark (which is written to emulate the dump(8) utility) went from 12MB/s to 90MB/s on my SATA disk. NFS servers with multiple nfsd threads also saw performance increases. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-10-26 14:34:47 +01:00
Jeff Moyer	b2c18e1e08	cfq: calculate the seek_mean per cfq_queue not per cfq_io_context async cfq_queue's are already shared between processes within the same priority, and forthcoming patches will change the mapping of cic to sync cfq_queue from 1:1 to 1:N. So, calculate the seekiness of a process based on the cfq_queue instead of the cfq_io_context. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-10-26 14:34:46 +01:00
Corrado Zoccolo	355b659c87	cfq-iosched: avoid probable slice overrun when idling If the average think time is larger than the remaining time slice for any given queue, don't allow it to idle. A successful idle also means that we need to dispatch and complete a request, so if we don't even have time left for the idle process, we would overrun the slice in any case. Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-10-08 08:43:32 +02:00
Jens Axboe	a6151c3a5c	cfq-iosched: apply bool value where we return 0/1 Saves 16 bytes of text, woohoo. But the more important point is that it makes the code more readable when returning bool for 0/1 cases. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-10-07 20:02:57 +02:00
Corrado Zoccolo	ec60e4f674	cfq-iosched: fix think time allowed for seekers CFQ enables idle only for processes that think less than the allowed idle time. Since idle time is lower for seeky queues, we should use the correct value in the comparison. Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-10-07 19:51:54 +02:00
Jens Axboe	b9c8946b19	cfq-iosched: fix the slice residual sign We should subtract the slice residual from the rb tree key, since a negative residual count indicates that the cfqq overran its slice the last time. Hence we want to add the overrun time, to position it a bit further away in the service tree. Reported-by: Corrado Zoccolo <czoccolo@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-10-06 21:09:32 +02:00
Jens Axboe	0b182d617e	cfq-iosched: abstract out the 'may this cfqq dispatch' logic Makes the whole thing easier to read, cfq_dispatch_requests() was a bit messy before. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-10-06 20:49:37 +02:00
Jens Axboe	23e018a1b0	block: get rid of kblock_schedule_delayed_work() It was briefly introduced to allow CFQ to do delayed scheduling, but we ended up removing that feature again. So let's kill the function and export, and just switch CFQ back to the normal work schedule since it is now passing in a '0' delay from all call sites. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-10-05 11:03:58 +02:00
Corrado Zoccolo	48e025e63a	cfq-iosched: fix possible problem with jiffies wraparound The RR service tree is indexed by a key that is relative to current jiffies. This can cause problems on jiffies wraparound. The patch fixes it using time_before comparison, and changing the add_front path to use a relative number, too. Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-10-05 11:03:55 +02:00
Jens Axboe	30996f40bf	cfq-iosched: fix issue with rq-rq merging and fifo list ordering cfq uses rq->start_time as the fifo indicator, but that field may get modified prior to cfq doing its fifo list adjustment when a request gets merged with another request. This can cause the fifo list to become unordered. Reported-by: Corrado Zoccolo <czoccolo@gmail.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-10-05 11:03:39 +02:00
Jens Axboe	e00c54c36a	cfq-iosched: don't delay async queue if it hasn't dispatched at all We cannot delay for the first dispatch of the async queue if it hasn't dispatched at all, since that could present a local user DoS attack vector using an app that just did slow timed sync reads while filling memory. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-10-04 20:36:19 +02:00
Jens Axboe	61f0c1dcaa	cfq-iosched: use assigned slice sync value, not default We should use the sysfs modified slice sync value, in case it differs from the default. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-10-03 19:46:03 +02:00
Jens Axboe	963b72fc66	cfq-iosched: rename 'desktop' sysfs entry to 'low_latency' Don't think that's necessarily a perfect description of what this option fiddles with, but it's probably better than 'desktop'. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-10-03 19:42:18 +02:00
Jens Axboe	8e29675555	cfq-iosched: implement slower async initiate and queue ramp up This slowly ramps up the async queue depth based on the time passed since the sync IO, and doesn't allow async at all until a sync slice period has passed. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-10-03 16:27:13 +02:00
Vivek Goyal	365722bb91	cfq-iosched: delay async IO dispatch, if sync IO was just done o Do not allow more than max_dispatch requests from an async queue, if some sync request has finished recently. This is in the hope that sync activity is still going on in the system and we might receive a sync request soon. Most likely from a sync queue which finished a request and we did not enable idling on it. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-10-03 15:21:27 +02:00
Jens Axboe	1d2235152d	cfq-iosched: add a knob for desktop interactiveness This is basically identical to what Vivek Goyal posted, but combined into one and labelled 'desktop' instead of 'fairness'. The goal is to continue to improve on the latency side of things as it relates to interactiveness, keeping the questionable bits under this sysfs tunable so it would be easy for throughput-only people to turn off. Apart from adding the interactive sysfs knob, it also adds the behavioural change of allowing slice idling even if the hardware does tagged command queuing. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-10-02 20:06:02 +02:00
Linus Torvalds	ada3fa1505	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (46 commits) powerpc64: convert to dynamic percpu allocator sparc64: use embedding percpu first chunk allocator percpu: kill lpage first chunk allocator x86,percpu: use embedding for 64bit NUMA and page for 32bit NUMA percpu: update embedding first chunk allocator to handle sparse units percpu: use group information to allocate vmap areas sparsely vmalloc: implement pcpu_get_vm_areas() vmalloc: separate out insert_vmalloc_vm() percpu: add chunk->base_addr percpu: add pcpu_unit_offsets[] percpu: introduce pcpu_alloc_info and pcpu_group_info percpu: move pcpu_lpage_build_unit_map() and pcpul_lpage_dump_cfg() upward percpu: add @align to pcpu_fc_alloc_fn_t percpu: make @dyn_size mandatory for pcpu_setup_first_chunk() percpu: drop @static_size from first chunk allocators percpu: generalize first chunk allocator selection percpu: build first chunk allocators selectively percpu: rename 4k first chunk allocator to page percpu: improve boot messages percpu: fix pcpu_reclaim() locking ... Fix trivial conflict as by Tejun Heo in kernel/sched.c	2009-09-15 09:39:44 -07:00
Jeff Moyer	06d2188644	cfq: choose a new next_req when a request is dispatched This patch addresses http://bugzilla.kernel.org/show_bug.cgi?id=13401, a regression introduced in 2.6.30. From the bug report: Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-14 08:24:52 +02:00
Shan Wei	b217a903ab	cfq: fix the log message after dispatched a request The blktrace tools can show process id when cfq dispatched a request, using cfq_log_cfqq() instead of cfq_log(). Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 14:34:33 +02:00
Jens Axboe	1b379d8daf	cfq-iosched: get rid of must_alloc flag It's not currently used, as pointed out by Gui Jianfeng <guijianfeng@cn.fujitsu.com>. We already check the wait_request flag to allow an idling queue priority allocation access, so we don't need this extra flag. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 14:33:32 +02:00
Jens Axboe	1f98a13f62	bio: first step in sanitizing the bio->bi_rw flag testing Get rid of any functions that test for these bits and make callers use bio_rw_flagged() directly. Then it is at least directly apparent what variable and flag they check. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 14:33:31 +02:00
Vivek Goyal	d58b85e1e8	cfq-iosched: no need to keep track of busy_rt_queues o Get rid of busy_rt_queues infrastructure. Looks like it is redundant. o Once an RT queue gets request it will preempt any of the BE or IDLE queues immediately. Otherwise this queue will be put on service tree and scheduler will anyway select this queue before any of the BE or IDLE queue. Hence looks like there is no need to keep track of how many busy RT queues are currently on service tree. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 14:33:30 +02:00
Jens Axboe	5ad531db6e	cfq-iosched: drain device queue before switching to a sync queue To lessen the impact of async IO on sync IO, let the device drain of any async IO in progress when switching to a sync cfqq that has idling enabled. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-09-11 14:33:30 +02:00
Tejun Heo	384be2b18a	Merge branch 'percpu-for-linus' into percpu-for-next Conflicts: arch/sparc/kernel/smp_64.c arch/x86/kernel/cpu/perf_counter.c arch/x86/kernel/setup_percpu.c drivers/cpufreq/cpufreq_ondemand.c mm/percpu.c Conflicts in core and arch percpu codes are mostly from commit ed78e1e078dd44249f88b1dd8c76dafb39567161 which substituted many num_possible_cpus() with nr_cpu_ids. As for-next branch has moved all the first chunk allocators into mm/percpu.c, the changes are moved from arch code to mm/percpu.c. Signed-off-by: Tejun Heo <tj@kernel.org>	2009-08-14 14:45:31 +09:00
Vivek Goyal	32f2e807a3	cfq-iosched: reset oom_cfqq in cfq_set_request() In case memory is scarce, we now default to oom_cfqq. Once memory is available again, we should allocate a new cfqq and stop using oom_cfqq for a particular io context. Once a new request comes in, check if we are using oom_cfqq, and if yes, try to allocate a new cfqq. Tested the patch by forcing the use of oom_cfqq and upon next request thread realized that it was using oom_cfqq and it allocated a new cfqq. Signed-off-by: Vivek Goyal <vgoyal@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-07-10 20:31:54 +02:00
Tejun Heo	c43768cbb7	Merge branch 'master' into for-next Pull linus#master to merge PER_CPU_DEF_ATTRIBUTES and alpha build fix changes. As alpha in percpu tree uses 'weak' attribute instead of inline assembly, there's no need for __used attribute. Conflicts: arch/alpha/include/asm/percpu.h arch/mn10300/kernel/vmlinux.lds.S include/linux/percpu-defs.h	2009-07-04 07:13:18 +09:00
Shan Wei	b706f64281	cfq-iosched: remove redundant check for NULL cfqq in cfq_set_request() With the changes for falling back to an oom_cfqq, we never fail to find/allocate a queue in cfq_get_queue(). So remove the check. Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-07-01 12:41:14 +02:00
Jens Axboe	6118b70b3a	cfq-iosched: get rid of the need for __GFP_NOFAIL in cfq_find_alloc_queue() Setup an emergency fallback cfqq that we allocate at IO scheduler init time. If the slab allocation fails in cfq_find_alloc_queue(), we'll just punt IO to that cfqq instead. This ensures that cfq_find_alloc_queue() never fails without having to ensure free memory. On cfqq lookup, always try to allocate a new cfqq if the given cfq io context has the oom_cfqq assigned. This ensures that we only temporarily punt to this shared queue. Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-07-01 10:56:25 +02:00
Jens Axboe	d5036d770f	cfq-iosched: move cfqq initialization out of cfq_find_alloc_queue() We're going to be needing that init code outside of that function to get rid of the __GFP_NOFAIL in cfqq allocation. Reviewed-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-07-01 10:56:25 +02:00
Tejun Heo	245b2e70ea	percpu: clean up percpu variable definitions Percpu variable definition is about to be updated such that all percpu symbols including the static ones must be unique. Update percpu variable definitions accordingly. * as,cfq: rename ioc_count uniquely * cpufreq: rename cpu_dbs_info uniquely * xen: move nesting_count out of xen_evtchn_do_upcall() and rename it * mm: move ratelimits out of balance_dirty_pages_ratelimited_nr() and rename it * ipv4,6: rename cookie_scratch uniquely * x86 perf_counter: rename prev_left to pmc_prev_left, irq_entry to pmc_irq_entry and nmi_entry to pmc_nmi_entry * perf_counter: rename disable_count to perf_disable_count * ftrace: rename test_event_disable to ftrace_test_event_disable * kmemleak: rename test_pointer to kmemleak_test_pointer * mce: rename next_interval to mce_next_interval [ Impact: percpu usage cleanups, no duplicate static percpu var names ] Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Christoph Lameter <cl@linux-foundation.org> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Jens Axboe <jens.axboe@oracle.com> Cc: Dave Jones <davej@redhat.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: linux-mm <linux-mm@kvack.org> Cc: David S. Miller <davem@davemloft.net> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Steven Rostedt <srostedt@redhat.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Andi Kleen <andi@firstfloor.org>	2009-06-24 15:13:48 +09:00
Jeff Moyer	6923715ae3	cfq: remove extraneous '\n' in blktrace output I noticed a blank line in blktrace output. This patch fixes that. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-06-16 08:21:04 +02:00
Gui Jianfeng	81be834713	cfq: cleanup for last_end_request in cfq_data Actually, last_end_request in cfq_data isn't used now. So let's just remove it. Signed-off-by: Gui Jianfeng <guijianfeng@cn.fujitsu.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-06-16 08:21:03 +02:00
Nikanth Karthikesan	d9c7d394a8	block: prevent possible io_context->refcount overflow Currently io_context has an atomic_t (32-bit) as refcount. In the case of cfq, for each device against which a task does I/O, a reference to the io_context would be taken. And when there are multiple processes sharing io_contexts (CLONE_IO), each would also have a reference to the same io_context. Theoretically the possible maximum number of processes sharing the same io_context + the number of disks/cfq_data referring to the same io_context can overflow the 32-bit counter on a very high-end machine. Even though it is an improbable case, let us make it atomic_long_t. Signed-off-by: Nikanth Karthikesan <knikanth@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-06-10 23:07:15 +02:00
Tejun Heo	2e46e8b27a	block: drop request->hard_* and nr_sectors struct request has had a few different ways to represent some properties of a request. ->hard_ represent block layer's view of the request progress (completion cursor) and the ones without the prefix are supposed to represent the issue cursor and allowed to be updated as necessary by the low level drivers. The thing is that as block layer supports partial completion, the two cursors really aren't necessary and only cause confusion. In addition, manual management of request detail from low level drivers is cumbersome and error-prone at the very least. Another interesting duplicate fields are rq->[hard_]nr_sectors and rq->{hard_cur\|current}_nr_sectors against rq->data_len and rq->bio->bi_size. This is more convoluted than the hard_ case. rq->[hard_]nr_sectors are initialized for requests with bio but blk_rq_bytes() uses it only for !pc requests. rq->data_len is initialized for all request but blk_rq_bytes() uses it only for pc requests. This causes good amount of confusion throughout block layer and its drivers and determining the request length has been a bit of black magic which may or may not work depending on circumstances and what the specific LLD is actually doing. rq->{hard_cur\|current}_nr_sectors represent the number of sectors in the contiguous data area at the front. This is mainly used by drivers which transfers data by walking request segment-by-segment. This value always equals rq->bio->bi_size >> 9. However, data length for pc requests may not be multiple of 512 bytes and using this field becomes a bit confusing. In general, having multiple fields to represent the same property leads only to confusion and subtle bugs. With recent block low level driver cleanups, no driver is accessing or manipulating these duplicate fields directly. Drop all the duplicates. Now rq->sector means the current sector, rq->data_len the current total length and rq->bio->bi_size the current segment length. 
Everything else is defined in terms of these three and available only through accessors. * blk_recalc_rq_sectors() is collapsed into blk_update_request() and now handles pc and fs requests equally other than rq->sector update. This means that now pc requests can use partial completion too (no in-kernel user yet tho). * bio_cur_sectors() is replaced with bio_cur_bytes() as block layer now uses byte count as the primary data length. * blk_rq_pos() is now guaranteed to be always correct. In-block users converted. * blk_rq_bytes() is now guaranteed to be always valid as is blk_rq_sectors(). In-block users converted. * blk_rq_sectors() is now guaranteed to equal blk_rq_bytes() >> 9. More convenient one is used. * blk_rq_bytes() and blk_rq_cur_bytes() are now inlined and take const pointer to request. [ Impact: API cleanup, single way to represent one property of a request ] Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-11 09:50:54 +02:00
Tejun Heo	83096ebf12	block: convert to pos and nr_sectors accessors With recent cleanups, there is no place where low level driver directly manipulates request fields. This means that the 'hard' request fields always equal the !hard fields. Convert all rq->sectors, nr_sectors and current_nr_sectors references to accessors. While at it, drop superfluous blk_rq_pos() < 0 test in swim.c. [ Impact: use pos and nr_sectors accessors ] Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Tested-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Grant Likely <grant.likely@secretlab.ca> Tested-by: Adrian McMenamin <adrian@mcmen.demon.co.uk> Acked-by: Adrian McMenamin <adrian@mcmen.demon.co.uk> Acked-by: Mike Miller <mike.miller@hp.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Borislav Petkov <petkovbb@googlemail.com> Cc: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Eric Moore <Eric.Moore@lsi.com> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: FUJITA Tomonori <fujita.tomonori@lab.ntt.co.jp> Cc: Pete Zaitcev <zaitcev@redhat.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Paul Clements <paul.clements@steeleye.com> Cc: Tim Waugh <tim@cyberelk.net> Cc: Jeff Garzik <jgarzik@pobox.com> Cc: Jeremy Fitzhardinge <jeremy@xensource.com> Cc: Alex Dubov <oakad@yahoo.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Dario Ballabio <ballabio_dario@emc.com> Cc: David S. Miller <davem@davemloft.net> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: unsik Kim <donari75@gmail.com> Cc: Laurent Vivier <Laurent@lvivier.info> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-11 09:50:54 +02:00
Tejun Heo	5b93629b45	block: implement blk_rq_pos/[cur_]sectors() and convert obvious ones Implement accessors - blk_rq_pos(), blk_rq_sectors() and blk_rq_cur_sectors() which return rq->hard_sector, rq->hard_nr_sectors and rq->hard_cur_sectors respectively and convert direct references of the said fields to the accessors. This is in preparation of request data length handling cleanup. Geert : suggested adding const to struct request * parameter to accessors Sergei : spotted error in patch description [ Impact: cleanup ] Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Acked-by: Stephen Rothwell <sfr@canb.auug.org.au> Tested-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com> Cc: Borislav Petkov <petkovbb@googlemail.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-05-11 09:50:53 +02:00
Tejun Heo	a7f5579234	block: kill blk_start_queueing() blk_start_queueing() is identical to __blk_run_queue() except that it doesn't check for recursion. None of the current users depends on blk_start_queueing() running request_fn directly. Replace usages of blk_start_queueing() with [__]blk_run_queue() and kill it. [ Impact: removal of mostly duplicate interface function ] Signed-off-by: Tejun Heo <tj@kernel.org>	2009-04-28 07:37:33 +02:00
Jens Axboe	f2d1f0ae78	cfq-iosched: cache prio_tree root in cfqq->p_root Currently we look it up from ->ioprio, but ->ioprio can change if either the process gets its IO priority changed explicitly, or if cfq decides to temporarily boost it. So if we are unlucky, we can end up attempting to remove a node from a different rbtree root than where it was added. Fix this by using ->org_ioprio as the prio_tree index, since that will only change for explicit IO priority settings (not for a boost). Additionally cache the rbtree root inside the cfqq, then we don't have to add code to reinsert the cfqq in the prio_tree if IO priority changes. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-04-24 08:54:22 +02:00
Jens Axboe	3ac6c9f8a6	cfq-iosched: fix bug with aliased request and cooperation detection cfq_prio_tree_lookup() should return the direct match, yet it always returns zero. Fix that. cfq_prio_tree_add() assumes that we don't get a direct match, while it is very possible that we do. Using O_DIRECT, you can have different cfqq with matching requests, since you don't have the page cache to serialize things for you. Fix this bug by only adding the cfqq if there isn't an existing match. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-04-24 08:54:22 +02:00
Jens Axboe	26a2ac009c	cfq-iosched: clear ->prio_trees[] on cfqd alloc Not strictly needed, but we should make it clear that we init the rbtree roots here. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-04-24 08:54:22 +02:00
Jeff Moyer	04dc6e71a2	cfq-iosched: use the default seek distance when there aren't enough seek samples If the cfq io context doesn't have enough samples yet to provide a mean seek distance, then use the default threshold we have for seeky IO instead of defaulting to 0. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-04-22 08:35:11 +02:00
Jeff Moyer	4d00aa47e2	cfq-iosched: make seek_mean converge more quickly Right now, depending on the first sector to which a process issues I/O, the seek time may start out way out of whack. So make sure we start with 0 sectors in seek, instead of the offset of the first request issued. Signed-off-by: Jeff Moyer <jmoyer@redhat.com> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-04-22 08:35:11 +02:00
Jens Axboe	a36e71f996	cfq-iosched: add close cooperator code If we have processes that are working in close proximity to each other on disk, we don't want to idle wait. Instead allow the close process to issue a request, getting better aggregate bandwidth. The anticipatory scheduler has similar checks, noop and deadline do not need it since they don't care about process <-> io mappings. The code for CFQ is a little more involved though, since we split request queues into per-process contexts. This fixes a performance problem with eg dump(8), since it uses several processes in some silly attempt to speed IO up. Even if dump(8) isn't really a valid case (it should be fixed by using CLONE_IO), there are other cases where we see close processes and where idling ends up hurting performance. Credit goes to Jeff Moyer <jmoyer@redhat.com> for writing the initial implementation. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-04-15 12:15:11 +02:00
Jens Axboe	9481ffdc61	cfq-iosched: log responsible 'cfqq' in idle timer arm Makes it easier to read the traces. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>	2009-04-15 12:14:13 +02:00
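
Commit 48e025e63a in the log above fixes a jiffies wraparound problem by switching to a time_before comparison. The sketch below illustrates the general idea of a wraparound-safe "before" test in plain userspace C. It is not the patch itself: `tick_t`, `tick_before()`, `tick_before_naive()` and `tick_before_demo()` are hypothetical names chosen for illustration, and the code assumes a two's-complement platform where converting the unsigned difference to `long` yields the expected negative value.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for a free-running tick counter such as jiffies. */
typedef unsigned long tick_t;

/*
 * Wraparound-safe "a is before b": the unsigned difference is
 * reinterpreted as signed, so the answer stays correct as long as
 * a and b are less than half the counter range apart.
 */
static inline bool tick_before(tick_t a, tick_t b)
{
    return (long)(a - b) < 0;
}

/* Naive comparison, shown only for contrast; it breaks across a wrap. */
static inline bool tick_before_naive(tick_t a, tick_t b)
{
    return a < b;
}

/* Small self-check around the wrap point. */
static inline void tick_before_demo(void)
{
    tick_t now = ULONG_MAX - 5;  /* just before the counter wraps */
    tick_t later = now + 10;     /* wraps around to 4 */

    assert(tick_before(now, later));        /* correct ordering */
    assert(!tick_before_naive(now, later)); /* naive test gets it wrong */
}
```

A tree keyed on such counter values needs this kind of relative comparison when ordering nodes; comparing raw key values would misplace entries inserted shortly before and after the counter wraps.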


202 Commits