We disabled the ability to enable this driver back in October of 2013, so
we should be able to safely remove it at this point. The initial goal
was to remove it in 3.15, so now is the time.
Signed-off-by: Jens Axboe <axboe@fb.com>
When bch_cache_set_alloc() fails to kzalloc the cache_set, the
asynchronous closure handling tries to dereference a cache_set that
hadn't yet been allocated inside of cache_set_flush() which is called
by __cache_set_unregister() during cleanup. This appears to happen only
during an OOM condition on bcache_register.
Signed-off-by: Eric Wheeler <bcache@linux.ewheeler.net>
Cc: stable@vger.kernel.org
The bch_writeback_thread might BUG_ON in read_dirty() if
dc->sb==BDEV_STATE_DIRTY and bch_sectors_dirty_init has not yet completed
its related initialization. This patch downs the dc->writeback_lock until
after initialization is complete, thus preventing bch_writeback_thread
from proceeding prematurely.
See this thread:
http://thread.gmane.org/gmane.linux.kernel.bcache.devel/3453
Signed-off-by: Eric Wheeler <bcache@linux.ewheeler.net>
Tested-by: Marc MERLIN <marc@merlins.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
The NVMe specification does not require that discarded blocks return zeroes
on read, but provides that behavior as a possibility. Some applications use
an SSD more efficiently if reads of discarded blocks are deterministically
zero, as indicated by the "discard_zeroes_data" queue attribute.
There is no specification-defined way to determine device behavior on
discarded blocks, so the driver always left the queue setting disabled. We
can only know behavior based on individual device models, so this patch
adds a flag to the NVMe "quirk" list that vendors may set if they know
their controller works that way. The patch also sets the new flag for one
such known device.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Suggested-by: Artur Paszkiewicz <artur.paszkiewicz@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
The do_div() macro now checks its arguments for the correct type,
and refuses anything other than u64, so we get a warning about
nbd_ioctl passing in an loff_t:
drivers/block/nbd.c: In function '__nbd_ioctl':
drivers/block/nbd.c:757:77: error: comparison of distinct pointer types lacks a cast [-Werror]
This changes the nbd code to use div_s64() instead, which takes
a signed argument.
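As a rough user-space illustration of the change above (not the kernel
code itself), the sketch below mimics the pointer-comparison type check
that produces the quoted warning, next to a signed 64-bit division helper.
DO_DIV_TYPECHECK(), div_s64_sketch() and bytes_to_blocks() are hypothetical
names introduced here; only the shape of div_s64() (signed 64-bit dividend,
signed 32-bit divisor) is taken from the kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative type check: comparing pointers of distinct types makes
 * the compiler warn, which is how a non-u64 argument gets flagged.
 * (__typeof__ is a GCC/Clang extension.)
 */
#define DO_DIV_TYPECHECK(n) \
	((void)(((__typeof__((n)) *)0) == ((uint64_t *)0)))

/* Hypothetical stand-in for div_s64(): signed 64-bit by signed 32-bit. */
static inline int64_t div_s64_sketch(int64_t dividend, int32_t divisor)
{
	return dividend / divisor;
}

/* Assumed use: turn a signed byte size (loff_t-like) into a block count. */
static inline int64_t bytes_to_blocks(int64_t bytesize, int32_t blksize)
{
	return div_s64_sketch(bytesize, blksize);
}
```

Passing an int64_t variable to DO_DIV_TYPECHECK() would trigger the
"comparison of distinct pointer types" diagnostic, while the signed helper
accepts it directly.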
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 37091fdd83 ("nbd: Create size change events for userspace")
Signed-off-by: Jens Axboe <axboe@fb.com>
In rrpc, some calculations assume a certain configuration (e.g., 1 LUN,
1 sector per page). The reason behind this was that LightNVM used a
simple configuration with QEMU to test core features in the beginning.
This patch relaxes these assumptions and generalizes the calculations,
allowing multiple LUNs to be used.
Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The struct nvm_dev->total_blocks was only used for calculating total
sectors. Remove it and instead calculate total sectors from the number of
LUNs and the sectors per LUN.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
The struct rrpc->nr_pages can easily be interpreted as the number of
flash pages allocated to rrpc, while it is actually the number of sectors
(nr_sects). Make sure that this is reflected in the variable name.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
When an I/O finishes, full blocks are moved from the open to the closed
list - a lock is taken to protect the list. This currently happens in
interrupt context, which is not correct.
This patch moves this logic to the block workqueue instead, avoiding
holding a spinlock without interrupt save in an interrupt context.
Signed-off-by: Javier González <javier@cnexlabs.com>
Fixes: ff0e498bfa ("lightnvm: manage open and closed blocks sepa...")
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Process names are truncated to 17 characters, while we allowed 20
characters for the name. So although we filled it out with various
details, the last 3 characters (which held the queue number) never
surfaced to user space.
To simplify this and be able to fit the device name, domain id,
and the queue number we remove the 'blkback' from the name.
Prior to this patch the device name is "blkback.<domid>.<name>"
for example: blkback.8.xvda, blkback.11.hda.
With the multiqueue block backend we add "-%d" for the queue.
But sadly this is already way past the limit so it gets stripped.
A possible solution was identified by Ian:
http://lists.xenproject.org/archives/html/xen-devel/2015-05/msg03516.html
"
If you are pressed for space then the "xvd" is probably a bit redundant
in a string which starts blkbk.
The guest may not even call the device xvdN (iirc BSD has another
prefix) any how, so having blkback say so seems of limited use anyway.
Since this seems to not include a partition number how does this work in
the split partition scheme? (i.e. one where the guest is given xvda1 and
xvda2 rather than xvda with a partition table)
[It will be 'blkback.8.xvda1', and 'blkback.11.xvda2']
Perhaps something derived from one of the schemes in
http://xenbits.xen.org/docs/unstable/misc/vbd-interface.txt might be a
better fit?
"
After a bit of discussion (see
http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg01588.html)
we settled on dropping the "blkback" part.
This will make it possible to have the <domid>.<name>-<queue>:
[1.xvda-0]
[1.xvda-1]
And we have enough space to make it go up to:
[32100.xvdfg9-5]
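The effect of dropping the prefix can be sketched in user space as below.
This is only an illustration, not the driver code: format_thread_name() is
a hypothetical helper, and the buffer capacity passed in is an assumption
chosen by the caller (the usage note borrows the figure of 17 from the
description above).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format "<prefix><domid>.<dev>-<queue>" into buf (capacity cap, NUL
 * included). Returns 1 if the whole name fit, 0 if it was truncated.
 */
static int format_thread_name(char *buf, size_t cap, const char *prefix,
			      int domid, const char *dev, int queue)
{
	int n = snprintf(buf, cap, "%s%d.%s-%d", prefix, domid, dev, queue);

	return n >= 0 && (size_t)n < cap;
}
```

With a 17-byte buffer, an empty prefix lets "32100.xvdfg9-5" fit in full,
whereas the "blkback." prefix pushes the queue suffix past the end so that
it is cut off.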
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
When the media manager runs in dual or quad plane mode, lightnvm
abstracts away plane specific commands. This poses a problem for
get bad block table, as it reports bad blocks per plane, making the
table either two or four times bigger than expected. Fold the bad block
list before returning.
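A minimal sketch of such a fold, under stated assumptions: the entries for
the planes of one block are adjacent in the table, 0 means "good", and any
non-zero plane entry marks the whole block. fold_bb_table() and this layout
are hypothetical and only illustrate the idea, not the actual lightnvm
implementation.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fold a per-plane bad block table in place. Assumes entry
 * (blk * nr_planes + pl) describes plane pl of block blk and that 0
 * means "good". Returns the folded length (one entry per block).
 */
static size_t fold_bb_table(unsigned char *tbl, size_t nr_entries,
			    unsigned int nr_planes)
{
	size_t nr_blks, blk;
	unsigned int pl;

	if (nr_planes == 0 || nr_entries % nr_planes)
		return 0;	/* malformed input: nothing to fold */

	nr_blks = nr_entries / nr_planes;
	for (blk = 0; blk < nr_blks; blk++) {
		unsigned char state = 0;

		/* A block is bad as soon as any of its planes is marked. */
		for (pl = 0; pl < nr_planes; pl++) {
			if (tbl[blk * nr_planes + pl] != 0) {
				state = tbl[blk * nr_planes + pl];
				break;
			}
		}
		tbl[blk] = state;
	}
	return nr_blks;
}
```

Folding in place is safe here because the write to tbl[blk] never lands
ahead of the entries that still have to be read.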
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
Instead of checking against a constant 0, actually check the space
available. Even better, remember to allow for the header and check that the
right amount of space is available.
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
There's no reason to defer this until the connect phase, and in fact
there are frontend implementations expecting this to be available
earlier. Move it into the probe function.
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Bob Liu <bob.liu@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
"max" is rather ambiguous and carries pretty little meaning, the more
so as there are also "max_queues" and "max_ring_page_order". Make this
"max_indirect_segments" instead, and at once change the type from int
to uint (to match the respective variable's type).
Acked-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Fail all pending requests after surprise removal of a drive.
Signed-off-by: Vignesh Gunasekaran <vgunasekaran@micron.com>
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
Allow device initialization to finish gracefully when it is in
FTL rebuild failure state. Also, recover device out of this state
after successfully secure erasing it.
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Vignesh Gunasekaran <vgunasekaran@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
Flush inflight IOs using fsync_bdev() when the device is safely
removed. Also, block further IOs in device open function.
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Rajesh Kumar Sambandam <rsambandam@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
When FTL rebuild is in progress, alloc_disk() initializes the disk
but device node will be created by add_disk() only after successful
completion of FTL rebuild. So, skip deletion of device node in
removal path when FTL rebuild is in progress.
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
Prevent standby immediate command from being issued in remove,
suspend and shutdown paths, while drive is in FTL rebuild process.
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Vignesh Gunasekaran <vgunasekaran@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
Print exact time when an internal command is interrupted.
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Rajesh Kumar Sambandam <rsambandam@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
Remove setting and clearing MTIP_PF_EH_ACTIVE_BIT flag in
mtip_handle_tfe() as they are redundant. Also avoid waking
up service thread from mtip_handle_tfe() because it is
already woken up in case of taskfile error.
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Rajesh Kumar Sambandam <rsambandam@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
The service thread does not detect the need for taskfile error handling.
Fix the flag condition so that taskfile errors are processed.
Signed-off-by: Selvan Mani <smani@micron.com>
Signed-off-by: Asai Thambi S P <asamymuthupa@micron.com>
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@fb.com>
Merge tag 'nbd-for-4.6' of git://git.pengutronix.de/git/mpa/linux-nbd into for-4.6/drivers
NBD for 4.6
Markus writes:
This pull request contains 7 patches for 4.6.
Patch 1 fixes some unnecessarily complicated code I introduced some versions
ago for debugfs.
Patch 2 removes the criticised signal usage within NBD to kill the NBD threads
after a timeout. This code has been used for years and is now replaced by
simply killing the tcp connection.
Patches 3-6 are some smaller cleanups.
Patch 7 adds uevents for userspace. This way udev/systemd can react to
connected NBD devices.
For NVMe over Fabrics, the cntlid will be used by systemd/udev to
create a link to the device, for example,
/dev/disk/by-path/<fabrics-info>-<cntlid>-<namespace> -> /dev/nvme0n1
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Both LightNVM and NVMe over Fabrics need to look at more than just the
status and result fields.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Matias Bjørling <m@bjorling.me>
Reviewed-by: Jay Freyensee <james.p.freyensee@intel.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
The only work left in the kthread is the periodic health check for each
controller. There is no need to run this from process context or keep
a thread context around for it, so replace it with a simpler timer.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
There is no reason to do unconditional polling of CQs per the NVMe
spec.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
Use a dedicated work item to submit async event requests instead of the
global kthread. This simplifies the code and reduces the latencies to
resubmit a request once an event notification has happened.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
The userspace needs to know when nbd devices are ready for use.
Currently no events are created for userspace, which does not work for
systemd.
See the discussion here: https://github.com/systemd/systemd/pull/358
This patch uses a central point to set up the nbd-internal sizes. An ioctl
to set a size does not lead to a visible size change. The size of the
block device will be kept at 0 until nbd is connected. As soon as it
connects, the size will be changed to the real value and a uevent is
created. When disconnecting, the blockdevice is set to 0 size and
another uevent is generated.
Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
NVMe over Fabrics drivers are going to reuse the core,
so split nvme.ko into 2 modules:
nvme-core.ko: the core part
nvme.ko: the PCI driver
Export symbols from nvme-core.ko.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Split dev_list_lock into one in the core and one in the PCI driver.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
These variables are used by PCI driver and will also be used in the
forthcoming NVMe over Fabrics drivers.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Sagi Grimberg <sagig@mellanox.com>
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
We don't want to be able to unload the fabric driver when we have
open references to our namespaces. Thus, for each nvme_open we
take a reference on the fabric driver and put it in nvme_release.
This behavior is consistent with the scsi model.
This resolves the panic when unloading a fabric module with
mpath holders.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ian Bakshan <ianb@mellanox.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Having the ctrl name "nvmeX" seems much more friendly than
the underlying device name. Also, with other nvme transports
such as the soon to come nvme-loop we don't have an underlying
device so it doesn't make sense to make one up.
In order to help matching an instance name to a pci function,
we add an info print in nvme_probe.
Signed-off-by: Sagi Grimberg <sagig@mellanox.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Manually fixed up the hunk in nvme_cancel_queue_ios().
Signed-off-by: Jens Axboe <axboe@fb.com>
Pass the right private data to device_create_with_groups from the
beginning, and remove the superfluous call to dev_set_drvdata.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
This notifies blk-mq when the tag set contains a different number of
queues prior to freeing unused ones that the request queue points to.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
The hardware's provided queue count may change at runtime with resource
provisioning. This patch allows a block driver to alter the number of
h/w queues available when its resource count changes.
The main part is a new blk-mq API to request a new number of h/w queues
for a given live tag set. The new API freezes all queues using that set,
then adjusts the allocated count prior to remapping these to CPUs.
The bulk of the rest just shifts where h/w contexts and all their
artifacts are allocated and freed.
The number of max h/w contexts is capped to the number of possible cpus
since there is no use for more than that. As such, all pre-allocated
memory for pointers needs to account for the max possible rather than
the initial number of queues.
A side effect of this is that the blk-mq will proceed successfully as
long as it can allocate at least one h/w context. Previously it would
fail request queue initialization if fewer than the requested number
were allocated.
Signed-off-by: Keith Busch <keith.busch@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Make the "Attempted send on closed socket" error messages generated in
nbd_request_handler() ratelimited.
When the nbd socket is shutdown, the nbd_request_handler() function emits
an error message for every request remaining in its queue. If the queue
is large, this will spam a large amount of messages to the log. There's
no need for a separate error message for each request, so this patch
ratelimits it.
In the specific case this was found, the system was virtual and the error
messages were logged to the serial port, which overwhelmed it.
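To illustrate the idea of rate limiting, here is a toy user-space model. It
is a sketch under stated assumptions and not the kernel's implementation:
struct ratelimit and ratelimit_ok() are hypothetical, and time is passed in
by the caller in arbitrary units so the behavior is easy to follow.

```c
#include <assert.h>

/* Hypothetical limiter state: at most 'burst' messages per 'interval'. */
struct ratelimit {
	long interval;	/* window length, in caller-defined time units */
	int burst;	/* messages allowed per window */
	long begin;	/* start of the current window */
	int printed;	/* messages emitted in this window */
	int missed;	/* messages suppressed in this window */
};

/* Returns 1 if a message may be emitted at time 'now', 0 otherwise. */
static int ratelimit_ok(struct ratelimit *rl, long now)
{
	if (now - rl->begin >= rl->interval) {
		/* New window: forget the previous counts. */
		rl->begin = now;
		rl->printed = 0;
		rl->missed = 0;
	}
	if (rl->printed < rl->burst) {
		rl->printed++;
		return 1;
	}
	rl->missed++;
	return 0;
}
```

A caller would print its error only when ratelimit_ok() returns 1, so a
long queue of failing requests produces a bounded number of log lines per
window instead of one line per request.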
Fixes: 4d48a542b4 ("nbd: fix I/O hang on disconnected nbds")
Signed-off-by: Dan Streetman <dan.streetman@canonical.com>
Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
nbd changes properties of the blockdevice depending on flags that were
received. This patch moves this flag parsing into a separate function
nbd_parse_flags().
Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
Group all variables that are reset after a disconnect into reset
functions. This patch adds two of these functions, nbd_reset() and
nbd_bdev_reset().
Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
It may be useful to know in the client that a connection timed out. The
current code returns success for a timeout.
This patch reports the error code -ETIMEDOUT for a timeout.
Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
As discussed on the mailing list, the usage of signals for timeout
handling has a lot of potential issues. The nbd driver has used signals
for timeouts for some time. These signals were able to get the threads
out of the blocking socket operations.
This patch removes all signal usage and uses a socket shutdown instead.
The socket descriptor itself is cleared later when the whole nbd device
is closed.
The tasks_lock is removed as we do not depend on this anymore. Instead
a new lock for the socket is introduced so we can safely work with the
socket in the timeout handler outside of the two main threads.
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Markus Pargmann <mpa@pengutronix.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Currently we don't allow sync workload of one cgroup to preempt sync
workload of any other cgroup. This is because we want to achieve service
separation between cgroups. However, in cases where the preempting cgroup
is an ancestor of the current cgroup, there is no need for separation and
idling introduces unnecessary overhead. This hurts, for example, the case
where the workload is isolated within a cgroup but the journalling threads
are in the root cgroup. A simple way to demonstrate the issue is using:
dbench4 -c /usr/share/dbench4/client.txt -t 10 -D /mnt 1
on ext4 filesystem on plain SATA drive (mounted with barrier=0 to make
difference more visible). When all processes are in the root cgroup,
reported throughput is 153.132 MB/sec. When dbench process gets its own
blkio cgroup, reported throughput drops to 26.1006 MB/sec.
Fix the problem by making the check in cfq_should_preempt() more lenient
and allowing preemption by an ancestor cgroup. This improves the throughput
reported by dbench4 to 48.9106 MB/sec.
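The group part of that decision can be sketched in user space as follows.
This is a simplified model, not the cfq code: struct grp, the helper names
and the three sample groups are hypothetical, and the sketch covers only
the "same group or ancestor" test, not the other preemption conditions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal cgroup node: only the parent link matters here. */
struct grp {
	const struct grp *parent;
};

/* Returns 1 if 'anc' is 'node' itself or one of its ancestors. */
static int grp_is_self_or_ancestor(const struct grp *anc,
				   const struct grp *node)
{
	for (; node; node = node->parent)
		if (node == anc)
			return 1;
	return 0;
}

/*
 * Group part of the preemption decision only: a queue in 'new_grp' may
 * be considered for preempting one in 'cur_grp' if it is in the same
 * group or in an ancestor of it. Unrelated groups stay separated.
 */
static int grp_allows_preempt(const struct grp *new_grp,
			      const struct grp *cur_grp)
{
	return grp_is_self_or_ancestor(new_grp, cur_grp);
}

/* Example hierarchy: root -> {workload, other}. */
static const struct grp root_grp = { NULL };
static const struct grp workload_grp = { &root_grp };
static const struct grp other_grp = { &root_grp };
```

In this model a journalling thread in root_grp passes the group test
against a queue in workload_grp, while a queue in other_grp does not, which
keeps sibling groups isolated from each other.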
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
The original idea with preemption of sync noidle queues (introduced in
commit 718eee0579 "cfq-iosched: fairness for sync no-idle queues") was
that we service all sync noidle queues together: we don't idle on any of
the queues individually, and we idle only if there is no sync noidle
queue to be served. This intention also matches the original test:
	if (cfqd->serving_type == SYNC_NOIDLE_WORKLOAD
	    && new_cfqq->service_tree == cfqq->service_tree)
		return true;
However since at that time cfqq->service_tree was not set for idling
queues, this test was unreliable and was replaced in commit e4a229196a
"cfq-iosched: fix no-idle preemption logic" by:
	if (cfqd->serving_type == SYNC_NOIDLE_WORKLOAD &&
	    cfqq_type(new_cfqq) == SYNC_NOIDLE_WORKLOAD &&
	    new_cfqq->service_tree->count == 1)
		return true;
That was a reliable test but was actually doing something different -
now we preempt sync noidle queue only if the new queue is the only one
busy in the service tree.
These days a cfq queue is kept in the service tree even if it is idling and
thus the original check would be safe again. But since we actually check
that cfq queues are in the same cgroup, of the same priority class and
workload type (sync noidle), we know that new_cfqq is fine to preempt
cfqq. So just remove the service tree check.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
Move the check for preemption by the rt class up. There is no functional
change, but it makes reasoning about the conditions simpler since we can be
sure both cfq queues are from the same ioprio class.
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>