136 Commits

Author SHA1 Message Date
Javier González
cca87bc9d3 lightnvm: do not assume sequential lun alloc.
When doing GC, rrpc calculates the physical LUN to which the rrpc block
belongs. This calculation is based on the assumption that LUNs are
assigned sequentially to the LUN list. Use the reference to the LUN
instead. This saves us the calculation and allows us to align LUNs in a
different manner to, for example, take advantage of device parallelism.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Javier González
75b8564932 lightnvm: rename dma helper functions
Until now, the dma pool has been used exclusively to allocate the ppa
list being sent to the device. In pblk (upcoming), we use these pools to
allocate metadata too. Thus, we generalize the names of some variables
in the dma helper functions to make the code more readable.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Javier González
57682b4915 lightnvm: do not free unused metadata on rrpc
rrpc does not save any metadata on a given request. Thus, do not attempt
to free the metadata dma region.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Matias Bjørling
293a6e8e27 lightnvm: fix out of bound ppa lun id on bb tbl
The ppa configured for retrieving the bad block table uses the internal
lun id to set up the get bad block ppa. This increases monotonically
with the number of luns available. When configuring a ppa, the channel and
lun must be specified separately, leading to an out-of-bounds memory
access in gennvm_block_bb when the lun id goes beyond the luns available
within a channel.

Additionally, remove the out-of-bounds check in gennvm_block_bb(), as it was
buggy to begin with.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Matias Bjørling
00ee6cc3b7 lightnvm: refactor set_bb_tbl for accepting ppa list
The set_bb_tbl takes struct nvm_rq and only uses its ppa_list and
nr_pages internally. Instead, make these two variables explicit.
This allows a user to call it without initializing a struct nvm_rq
first.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Matias Bjørling
a63d5cf203 lightnvm: move responsibility for bad blk mgmt to target
We move the responsibility of managing the persistent bad block table to
the target. The target may choose to mark a block bad or retry writing
to it. Nevertheless, it should be the target that makes the decision
and not the media manager.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Matias Bjørling
5ebc7d9fe1 lightnvm: make nvm_set_rqd_ppalist() aware of vblks
A virtual block enables a block to identify multiple physical blocks.
This is useful for metadata where the device media supports multiple
planes. In that case, a block with multiple planes can be managed
as a single vblk, reducing the metadata required by one fourth.

nvm_set_rqd_ppalist() takes care of expanding a ppa_list with vblks
automatically. However, for some use-cases, where only a single physical
block is required, the ppa_list should not be expanded.

Therefore, add a vblk parameter to nvm_set_rqd_ppalist(), and only
expand the ppa_list if vblk is set.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Matias Bjørling
6659d4d80c lightnvm: remove struct factory_blks
Now that device ops->get_bb_table no longer uses a callback, the
struct factory_blks can be removed.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Matias Bjørling
e11903f5df lightnvm: refactor device ops->get_bb_tbl()
The device ops->get_bb_tbl() takes a callback, that allows the caller
to use its own callback function to update its data structures in the
returning function.

This makes it difficult to pass parameters to the callback, which is
usually worked around with small private structures that carry both the
caller's state and any flags needed to fulfill the update.

Refactor ops->get_bb_tbl() to fill a data buffer with the status of the
blocks returned, and let the user call the callback function manually.
That will provide the necessary flags and data structures and simplify
the logic around ops->get_bb_tbl().

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Matias Bjørling
5136061ce7 lightnvm: introduce nvm_for_each_lun_ppa() macro
Users that wish to iterate over all luns on a device must create a
struct ppa_addr and separate iterators for channels and luns. To set the
iterators, two loops are required: one to iterate over channels and another
to iterate over luns. This reduces readability.

Introduce nvm_for_each_lun_ppa, which implements the nested loop and
sets ppa, channel, and lun variable for each loop body, eliminating
the boilerplate code.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
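The iteration pattern described in 5136061ce7 can be illustrated with a small user-space sketch. This is not the kernel macro: the types `demo_dev` and `demo_ppa`, the macro `demo_for_each_lun_ppa` and the helper `demo_count_luns` are hypothetical names invented for illustration, and the real `struct ppa_addr` is laid out differently.

```c
#include <assert.h>

/* Hypothetical stand-ins for the kernel's device and ppa types. */
struct demo_dev { int nr_chnls; int luns_per_chnl; };
struct demo_ppa { int ch; int lun; };

/*
 * Hide the nested channel/lun loops behind one macro. The caller supplies
 * the ppa and both iterator variables; ppa is filled in before each body.
 */
#define demo_for_each_lun_ppa(dev, ppa, chid, lunid)                  \
	for ((chid) = 0; (chid) < (dev)->nr_chnls; (chid)++)          \
		for ((lunid) = 0;                                     \
		     (lunid) < (dev)->luns_per_chnl &&                \
		     ((ppa).ch = (chid), (ppa).lun = (lunid), 1);     \
		     (lunid)++)

/* Visit every lun once; report how many were seen and the last ppa. */
static int demo_count_luns(const struct demo_dev *dev,
			   int *last_ch, int *last_lun)
{
	struct demo_ppa ppa = { 0, 0 };
	int ch, lun, n = 0;

	assert(dev->nr_chnls >= 0 && dev->luns_per_chnl >= 0);

	demo_for_each_lun_ppa(dev, ppa, ch, lun) {
		*last_ch = ppa.ch;
		*last_lun = ppa.lun;
		n++;
	}
	return n;
}
```

With 2 channels and 3 luns per channel, the body runs six times and the last ppa seen is channel 1, lun 2.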
Simon A. F. Lund
6f8645cba5 lightnvm: refactor dev->online_target to global nvm_targets
A target name must be unique. However, a per-device registration of
targets is maintained on a dev->online_targets list, with a per-device
search for targets upon registration.

This results in a name collision when two targets with the same name
are created on two different devices, where the per-device list is not
shared.

Signed-off-by: Simon A. F. Lund <slund@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Simon A. F. Lund
6063fe399d lightnvm: rename nvm_targets to nvm_tgt_type
The functions nvm_register_target(), nvm_unregister_target() and the
associated list refer to a target type that is being registered by a
target type module. Rename nvm_*_targets() to nvm_*_tgt_type(), so that
the intention is clear.

This enables target instances to use the _nvm_*_targets() naming.

Signed-off-by: Simon A. F. Lund <slund@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Wenwei Tao
909049a719 lightnvm: store rrpc->soffset in device sector size
Since we mainly use soffset in device sector size, store
this value in rrpc->soffset, instead of the offset in 512-byte sector
size. This eliminates the "(ilog2(dev->sec_size) - 9)" calculation on
each I/O.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Updated patch description.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
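The shift that 909049a719 removes from the I/O path can be sketched as follows. This is a user-space illustration under the assumption that the conversion goes from 512-byte sectors to device sectors with a right shift; `demo_ilog2` and `demo_to_dev_sects` are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* log2 of a power-of-two value, e.g. 4096 -> 12. */
static unsigned int demo_ilog2(uint32_t v)
{
	unsigned int r = 0;

	assert(v != 0 && (v & (v - 1)) == 0);
	while (v >>= 1)
		r++;
	return r;
}

/*
 * Convert an offset counted in 512-byte sectors (2^9 bytes) into an offset
 * counted in device sectors of sec_size bytes.
 */
static uint64_t demo_to_dev_sects(uint64_t off_512, uint32_t sec_size)
{
	assert(sec_size >= 512);
	return off_512 >> (demo_ilog2(sec_size) - 9);
}
```

For a 4096-byte device sector the shift is 3, so an offset of 80 in 512-byte sectors becomes 10 device sectors. Storing the converted value once means this work is not repeated per request.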
Wenwei Tao
66e3d07f75 lightnvm: calculate rrpc total blocks and sectors up front
Calculate rrpc total blocks and sectors up front, so that it makes sense
to use them. For example, we use rrpc->nr_sects to calculate the rrpc
area size, but that makes no sense if we don't initialize it up front,
since it would be zero until we finish the rrpc luns init.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Matias Bjørling
7f7c5d03c0 lightnvm: avoid memory leak when lun_map kcalloc fails
A memory leak occurs if the lower page table is initialized and the
following dev->lun_map fails on allocation.

Rearrange the initialization of lower page table to allow dev->lun_map
to fail gracefully without memory leak.

Reviewed by: Johannes Thumshirn <jthumshirn@suse.de>
Move kfree of dev->lun_map to nvm_free()
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Matias Bjørling
22e8c9766a lightnvm: move block fold outside of get_bb_tbl()
The get block table command returns a list of blocks and planes
with their associated state. Users, such as gennvm and sysblk,
manage all planes as a single virtual block.

It was therefore natural to fold the bad block list before it is
returned. However, to allow users that manage blocks on a per-plane
level to also use the interface, the get_bb_tbl interface is
changed to not fold by default and instead let the caller fold if
necessary.

Reviewed by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Matias Bjørling
4891d120b9 lightnvm: add fpg_size and pfpg_size to struct nvm_dev
The flash page size (fpg) and size across planes (pfpg) are convenient
to know when allocating buffer sizes. These have previously been
calculated in various places. Replace with the pre-calculated values.

Reviewed by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Matias Bjørling
1145e6351a lightnvm: implement nvm_submit_ppa_list
The nvm_submit_ppa function assumes that users manage all plane
blocks as a single block. Extend the API with nvm_submit_ppa_list
to allow the user to send its own ppa list. If the user submits more
than a single PPA, the user must take care to allocate and free
the corresponding ppa list.

Reviewed by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Matias Bjørling
ecfb40c6aa lightnvm: handle submit_io failure
The device ->submit_io() callback might fail to submit I/O to the device.
In that case, the nvm_submit_ppa function should not wait for
completion. Instead, return the ->submit_io() error.

Reviewed by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Jeff Mahoney
57aac2f1be lightnvm: fix "warning: ‘ret’ may be used uninitialized"
This fixes the following warnings:
drivers/lightnvm/sysblk.c:125:9: warning: ‘ret’ may be used
uninitialized in this function

drivers/lightnvm/sysblk.c:275:15: warning: ‘ret’ may be used
uninitialized in this function

In both cases, ret is only set from within a loop that may not be entered.

Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-05-06 12:51:10 -06:00
Javier González
29fd20b8e6 lightnvm: do not load L2P table if not supported
An Open-Channel SSD can work in two modes: (i) hybrid mode, where the
L2P table is maintained both by the host and by the device; and (ii)
full host-based, where the L2P table is maintained exclusively by the host.

In anticipation of a new target implementing the full host-based mode, do
not assume that the L2P table must be loaded in the generic media
manager; check the device properties reported by the identify command instead.

Signed-off-by: Javier González <javier@cnexlabs.com>
Moved into the following statement.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-18 18:10:38 -07:00
Javier González
719b59172c lightnvm: do not reserve lun on l2p loading
When the l2p table is loaded, addresses are checked for the lun they
belong to and luns are reserved accordingly. This assumes that metadata
is being stored in the backend device to recover the previous target
configuration. Since this is not yet implemented, this check collides
with some of the core initialization (e.g., sysblock initialization when
a page is formed by several sectors).

We take this check out and for now rely on the right target being
created instead. When metadata is stored to recover a target, this check
will come naturally as part of the recovery strategy.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-18 18:10:38 -07:00
Wenwei Tao
da1e284919 lightnvm: add a bitmap of luns
Add a bitmap of luns to indicate the status
of luns: inuse/available. When creating targets,
do the necessary check to avoid allocating luns
that are already allocated.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Freed dev->lun_map if nvm_core_init later failed in the init process.
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-18 18:10:38 -07:00
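The reservation check described in da1e284919 can be sketched in user space with a single 64-bit word as the bitmap. This is an illustration only: `demo_reserve_luns` is a hypothetical helper, the 64-lun limit is an assumption of the sketch, and the kernel uses its own bitmap helpers rather than this code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * One bit per lun; a set bit means "in use". Reserve the inclusive range
 * [lun_begin, lun_end], or fail without changing the map if any lun in the
 * range is already taken.
 */
static int demo_reserve_luns(uint64_t *lun_map, int lun_begin, int lun_end)
{
	uint64_t mask = 0;
	int i;

	assert(lun_begin >= 0 && lun_begin <= lun_end && lun_end < 64);

	for (i = lun_begin; i <= lun_end; i++)
		mask |= (uint64_t)1 << i;

	if (*lun_map & mask)
		return -1;	/* overlap with an existing target */

	*lun_map |= mask;
	return 0;
}
```

Creating a target on luns 0 to 3 succeeds, a second target on luns 2 to 5 is rejected, and a target on luns 4 to 7 succeeds afterwards.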
Wenwei Tao
4c9dacb82d lightnvm: specify target's logical address area
We can create more than one target on a lightnvm
device by specifying its begin lun and end lun.

However, specifying only the physical address area is not
enough. We also need to obtain a corresponding, non-intersecting
logical address area from the backend device's logical address
space. Otherwise, the targets on the device might use
the same logical addresses and cause incorrect
information in the device's l2p table.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-18 18:10:37 -07:00
Linus Torvalds
237045fc3c Merge branch 'for-4.6/drivers' of git://git.kernel.dk/linux-block
Pull block driver updates from Jens Axboe:
 "This is the block driver pull request for this merge window.  It sits
  on top of for-4.6/core, that was just sent out.

  This contains:

   - A set of fixes for lightnvm.  One from Alan, fixing an overflow,
     and the rest from the usual suspects, Javier and Matias.

   - A set of fixes for nbd from Markus and Dan, and a fixup from Arnd
     for correct usage of the signed 64-bit divider.

   - A set of bug fixes for the Micron mtip32xx, from Asai.

   - A fix for the brd discard handling from Bart.

   - Update the maintainers entry for cciss, since that hardware has
     transferred ownership.

   - Three bug fixes for bcache from Eric Wheeler.

   - Set of fixes for xen-blk{back,front} from Jan and Konrad.

   - Removal of the cpqarray driver.  It has been disabled in Kconfig
     since 2013, and we were initially scheduled to remove it in 3.15.

   - Various updates and fixes for NVMe, with the most important being:

        - Removal of the per-device NVMe thread, replacing that with a
          watchdog timer instead. From Christoph.

        - Exposing the namespace WWID through sysfs, from Keith.

        - Set of cleanups from Ming Lin.

        - Logging the controller device name instead of the underlying
          PCI device name, from Sagi.

        - And a bunch of fixes and optimizations from the usual suspects
          in this area"

* 'for-4.6/drivers' of git://git.kernel.dk/linux-block: (49 commits)
  NVMe: Expose ns wwid through single sysfs entry
  drivers:block: cpqarray clean up
  brd: Fix discard request processing
  cpqarray: remove it from the kernel
  cciss: update MAINTAINERS
  NVMe: Remove unused sq_head read in completion path
  bcache: fix cache_set_flush() NULL pointer dereference on OOM
  bcache: cleaned up error handling around register_cache()
  bcache: fix race of writeback thread starting before complete initialization
  NVMe: Create discard zero quirk white list
  nbd: use correct div_s64 helper
  mtip32xx: remove unneeded variable in mtip_cmd_timeout()
  lightnvm: generalize rrpc ppa calculations
  lightnvm: remove struct nvm_dev->total_blocks
  lightnvm: rename ->nr_pages to ->nr_sects
  lightnvm: update closed list outside of intr context
  xen/blback: Fit the important information of the thread in 17 characters
  lightnvm: fold get bb tbl when using dual/quad plane mode
  lightnvm: fix up nonsensical configure overrun checking
  xen-blkback: advertise indirect segment support earlier
  ...
2016-03-18 17:13:31 -07:00
Javier González
afb18e0ed8 lightnvm: generalize rrpc ppa calculations
In rrpc, some calculations assume a certain configuration (e.g., 1 LUN,
1 sector per page). The reason behind this was that LightNVM used a
simple configuration with QEMU to test core features in the beginning.
This patch relaxes these assumptions and generalizes calculation,
allowing multiple luns to be used.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-03 14:47:53 -07:00
Matias Bjørling
ed2a92a6b4 lightnvm: remove struct nvm_dev->total_blocks
The struct nvm_dev->total_blocks was only used for calculating total
sectors. Remove it and instead calculate total sectors from the number of
luns and their sectors.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-03 14:46:35 -07:00
Matias Bjørling
4ece44af73 lightnvm: rename ->nr_pages to ->nr_sects
The struct rrpc->nr_pages can easily be interpreted as the number of
flash pages allocated to rrpc, while it is the nr_sects. Make sure that
this is reflected in the variable name.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-03 14:46:35 -07:00
Javier González
6adb03de40 lightnvm: update closed list outside of intr context
When an I/O finishes, full blocks are moved from the open to the closed
list - a lock is taken to protect the list. This currently happens in
interrupt context, which is not correct.

This patch moves this logic to the block workqueue instead, avoiding
holding a spinlock without interrupt save in an interrupt context.

Signed-off-by: Javier González <javier@cnexlabs.com>
Fixes: ff0e498bfa ("lightnvm: manage open and closed blocks sepa...")
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-03 14:46:35 -07:00
Matias Bjørling
d5bdec8ddb lightnvm: fold get bb tbl when using dual/quad plane mode
When the media manager runs in dual or quad plane mode, lightnvm
abstracts away plane specific commands. This poses a problem for
get bad block table, as it reports bad blocks per plane, making the
table either two or four times bigger than expected. Fold the bad block
list before returning.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-03 14:45:53 -07:00
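The folding step described in d5bdec8ddb can be sketched as below. This is a user-space illustration, not the kernel routine: `demo_fold_bb_tbl` is a hypothetical name, the sketch assumes that the per-plane entries of one block are adjacent in the table, and it treats any non-zero state as "not free".

```c
#include <assert.h>

/*
 * Fold a per-plane table in place. Input: nr_entries states, with `planes`
 * consecutive entries per block. Output: the first nr_entries / planes
 * entries hold one state per block, taken from the first plane that is not
 * free (0). Returns the number of blocks, or -1 on a malformed table.
 */
static int demo_fold_bb_tbl(unsigned char *blks, int nr_entries, int planes)
{
	int blk, pl, nr_blks;

	assert(planes > 0);

	if (nr_entries % planes)
		return -1;

	nr_blks = nr_entries / planes;
	for (blk = 0; blk < nr_blks; blk++) {
		unsigned char state = 0;

		for (pl = 0; pl < planes; pl++) {
			if (blks[blk * planes + pl]) {
				state = blks[blk * planes + pl];
				break;
			}
		}
		/* Safe in place: index blk was already consumed. */
		blks[blk] = state;
	}
	return nr_blks;
}
```

In dual plane mode, a table of eight per-plane entries folds down to four per-block entries, and a block is reported as bad if either of its planes is bad.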
Alan
5e422cffe8 lightnvm: fix up nonsensical configure overrun checking
Instead of checking a constant 0 actually check the space available. Even
better remember to allow for the header and also check the right amount of
space is needed.

Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-03-03 14:45:53 -07:00
Matias Bjørling
bf64318564 lightnvm: allow to force mm initialization
System block allows the device to initialize with its configured media
manager. The system blocks are written to disk, and read again when the media
manager is determined. For this to work, the backend must store the
data. Device drivers, such as null_blk, do not have any backend
storage. This patch allows the media manager to be initialized without a
storage backend.

It also fixes incorrect configuration of capabilities in null_blk, as it
does not support the get/set bad block interface.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-02-04 09:19:45 -07:00
Javier González
3704e098cc lightnvm: fix request intersection locking in rrpc
This patch fixes an error in the calculation of intersecting logical
addresses; it contemplates the case where a new request including
several addresses intersects with a single locked address. This case is
typical when multiple pages are sent in a new request, while GC - which
at the moment sends one address at a time - is running.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-02-04 09:19:45 -07:00
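The case that 3704e098cc covers, a multi-address request overlapping a single locked address, comes down to an interval overlap test. The sketch below shows the general technique on half-open ranges; `demo_ranges_intersect` is a hypothetical helper and is not claimed to match the rrpc code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return 1 if [a_start, a_start + a_len) and [b_start, b_start + b_len)
 * share at least one logical address, 0 otherwise.
 */
static int demo_ranges_intersect(uint64_t a_start, uint32_t a_len,
				 uint64_t b_start, uint32_t b_len)
{
	assert(a_len > 0 && b_len > 0);
	return a_start < b_start + b_len && b_start < a_start + a_len;
}
```

A new request covering addresses 100 to 107 intersects a single locked address 103, but not a locked address 108.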
Javier González
bba7f40a02 lightnvm: warn if irqs are disabled in lock laddr
Add a warning if irqs are disabled when locking a new address in rrpc.
The typical path to a new request does not disable irqs, but this is not
guaranteed in the future.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-02-04 09:19:45 -07:00
Wenwei Tao
16c6d048d7 lightnvm: put bio before return
The bio is not returned if the data page cannot be allocated.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-02-04 09:19:45 -07:00
Matias Bjørling
8b4970c41f lightnvm: introduce factory reset
Now that a device can be managed using the system blocks, a method to
reset the device is necessary as well. This patch introduces logic to
reset the device easily to factory state and exposes it through an
ioctl.

The ioctl takes the following flags:

  NVM_FACTORY_ERASE_ONLY_USER
      By default all blocks, except host-reserved blocks are erased upon
      factory reset. Instead of this, only erase host-reserved blocks.
  NVM_FACTORY_RESET_HOST_BLKS
      Mark host-reserved blocks to be erased and set their type to free.
  NVM_FACTORY_RESET_GRWN_BBLKS
      Mark "grown bad blocks" to be erased and set their type to free.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:18 -07:00
Matias Bjørling
b769207678 lightnvm: use system block for mm initialization
Use system block information to register the appropriate media manager.
This enables the LightNVM subsystem to instantiate a media manager
selected by the user, instead of relying on automatic detection by each
media manager loaded in the kernel.

A device must now be initialized before it can proceed to initialize its
media manager. Upon initialization, the configured media manager is
automatically initialized as well.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:18 -07:00
Matias Bjørling
5569615424 lightnvm: introduce ioctl to initialize device
Based on the previous patch, we now introduce an ioctl to initialize the
device using nvm_init_sysblock and create the necessary system blocks.
The user may specify the media manager that they wish to instantiate on
top. Default from user-space will be "gennvm".

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:18 -07:00
Matias Bjørling
e3eb3799f7 lightnvm: core on-disk initialization
An Open-Channel SSD shall be initialized before use. To initialize, we
define an on-disk format, that keeps a small set of metadata to bring up
the media manager on top of the device.

The initial step is introduced to allow a user to format the disks for a
given media manager. During format, a system block is stored on one to
three separate luns on the device. Each lun has the system block
duplicated. During initialization, the system block can be retrieved and
the appropriate media manager can be initialized.

The on-disk format currently covers (struct nvm_system_block):

 - Magic value "NVMS".
 - Monotonically increasing sequence number.
 - The physical block erase count.
 - Version of the system block format.
 - Media manager type.
 - Media manager superblock physical address.

The interface provides three functions to manage the system block:

 int nvm_init_sysblock(struct nvm_dev *, struct nvm_sb_info *)
 int nvm_get_sysblock(struct nvm_dev *, struct nvm_sb_info *)
 int nvm_update_sysblock(struct nvm_dev *, struct nvm_sb_info *)

Each implements a part of the logic to manage the system block. The
initialization creates the first system blocks and marks them on the
device. Get retrieves the latest system block by scanning all pages in
the associated system blocks. The update sysblock writes new metadata
and allocates a new block if necessary.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:18 -07:00
Matias Bjørling
ca5927e7ab lightnvm: introduce mlc lower page table mappings
NAND MLC memories have both lower and upper pages. When programming,
both of these must be written before data can be read. However,
these lower and upper pages might not be placed at even and odd flash
pages, but can be skipped. Therefore each flash memory has its lower
pages defined, which can then be used when programming and to know when
padding is necessary.

This patch implements the lower page definition in the specification,
and exposes it through a simple lookup table at dev->lptbl.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:17 -07:00
Matias Bjørling
f9a9995072 lightnvm: add mccap support
Some flash media have extended capabilities, such as programming SLC
pages on MLC/TLC flash, erase/program suspend, scramble and encryption.
MCCAP is introduced to detect support for these capabilities in the
command set.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:17 -07:00
Javier González
ff0e498bfa lightnvm: manage open and closed blocks separately
LightNVM targets need to know the state of the flash block when doing
flash optimizations. An example is implementing a write buffer to
respect the flash page size. Currently, block state is not accounted
for; the media manager only differentiates among free, bad and in-use
blocks.

This patch adds the logic in the generic media manager to enable
targets to manage open and closed blocks separately, and it implements
such management in rrpc. It also adds a set of flags to describe the
state of the block (open, closed, free, bad).

In order to avoid taking two locks (nvm_lun and rrpc_lun) consecutively,
we introduce lockless get_/put_block primitives so that the open and
closed list locks and future common logic are handled within the nvm_lun
lock.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:17 -07:00
Javier González
d7a64d275b lightnvm: reference rrpc lun in rrpc block
Currently, an rrpc block only points to its nvm_lun. If a user wants to
find the associated rrpc lun, it will have to calculate the index and
look it up manually. By referencing the rrpc lun directly, this step can
be omitted, at the cost of a larger memory footprint.

This is important for upcoming patches that implement write buffering in
rrpc.

Signed-off-by: Javier González <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:17 -07:00
Matias Bjørling
09719b62fd lightnvm: introduce nvm_submit_ppa
Internal logic for both core and media managers does not have a
backing bio for issuing I/Os. Introduce nvm_submit_ppa to allow raw
I/Os to be submitted to the underlying device driver.

The function takes the device, ppa, data buffer and its length and
will submit the I/O synchronously to the device. The return value may
therefore be used to detect any errors regarding the issued I/O.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:17 -07:00
Matias Bjørling
72d256ecc5 lightnvm: move rq->error to nvm_rq->error
Instead of passing request error into the LightNVM modules, incorporate
it into the nvm_rq.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:17 -07:00
Matias Bjørling
81e681d3f7 lightnvm: support multiple ppas in nvm_erase_ppa
Sometimes a user wants to erase multiple PPAs at the same time. Extend
nvm_erase_ppa to take multiple ppas and number of ppas to be erased.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:17 -07:00
Wenwei Tao
4b79beb4c3 lightnvm: move the pages per block check out of the loop
There is no need to check whether dev's pages per block is
beyond rrpc support every time we init a lun. We only need
to check it once before entering the lun init loop.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:17 -07:00
Matias Bjørling
556755e941 lightnvm: sectors first in ppa list
The Westlake controller requires that the PPA list has sectors defined
sequentially. Currently, the PPA list is created with planes first, then
sectors. Change this to sectors first, then planes.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:16 -07:00
Wenwei Tao
b262924be0 lightnvm: fix locking and mempool in rrpc_lun_gc
This patch fixes two issues in rrpc_lun_gc:

1. prio_list is protected by rrpc_lun's lock, not nvm_lun's, so
acquire rlun's lock instead of lun's before operating on the list.

2. We delete the block from prio_list before allocating gcb, but the gcb
allocation may fail. In that case we return without putting the block
back on the list, so it will never be reclaimed. To solve
this issue, delete the block after the gcb allocation.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:16 -07:00
Wenwei Tao
d0ca798f96 lightnvm: put block back to gc list on its reclaim fail
We delete a block from the gc list before reclaiming it, so
put it back on the list if its reclaim fails. Otherwise
this block will not get reclaimed and be programmable
in the future.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:16 -07:00
Wenwei Tao
2b11c1b24e lightnvm: check bi_error in gc
We should check last io completion status before
starting another one.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:16 -07:00
Matias Bjørling
91276162de lightnvm: refactor end_io functions for sync
To implement sync I/O support within the LightNVM core, the end_io
functions are refactored to take an end_io function pointer instead of
testing for initialized media manager, followed by calling its end_io
function.

Sync I/O can then be implemented using a callback that signals I/O
completion. This is similar to the logic found in blk_to_execute_io().
By implementing it this way, the underlying device I/O submission logic
is abstracted away from core, targets, and media managers.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:16 -07:00
Matias Bjørling
abd805ec9f lightnvm: refactor rqd ppa list into set/free
A device may be driven in single, double or quad plane mode. In that
case, the rqd must have either one, two, or four PPAs set for a single
PPA sent to the device. Refactor this logic into their own
functions to be shared by program/erase/read in the core.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:16 -07:00
Matias Bjørling
069368e918 lightnvm: move ppa erase logic to core
A device may function in single, dual or quad plane mode. The gennvm
media manager manages this with explicit helpers. They convert a single
ppa to 1, 2 or 4 separate ppas in a ppa list. To aid implementation of
recovery and system blocks, this functionality can be moved directly
into the core.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:16 -07:00
Wenwei Tao
c27278bddd lightnvm: unlock rq and free ppa_list on submission fail
When rrpc_write_ppalist_rq and rrpc_read_ppalist_rq succeed, we set up
the rq correctly, but nvm_submit_io may fail afterwards because it cannot
allocate a request or nvme_nvm_command. We return an error but forget to
clean up the previous work.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:16 -07:00
Javier Gonzalez
3bfbc6adbc lightnvm: add check after mempool allocation
The mempool allocation might fail. Make sure to return error when it
does, instead of causing a kernel panic.

Signed-off-by: Javier Gonzalez <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:16 -07:00
Chao Yu
bdded15520 lightnvm: fix incorrect nr_free_blocks stat
When initializing the bad block list in gennvm_block_bb, once a bad block
is moved from free_list to bb_list, both the nr_free_blocks and
nr_bad_blocks stats must be maintained. This patch adds the missing
nr_free_blocks update.

Signed-off-by: Chao Yu <chao2.yu@samsung.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:15 -07:00
Wenwei Tao
3cd485b1f8 lightnvm: fix bio submission issue
Put the bio when submission fails, since we get it
before submission. Also return an error when the backend
device driver doesn't provide a submit_io method,
so that we can end the I/O properly.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-01-12 08:21:15 -07:00
Matias Bjørling
c3293a9ac2 lightnvm: wrong offset in bad blk lun calculation
dev->nr_luns reports the total number of luns available in a device
while dev->luns_per_chnl is the number of luns per channel.

When multiple channels are available, the offset into a linear array is
calculated from a channel and lun id. As the channel id is multiplied by
the total number of luns, we go out of bounds when channel id > 0, and
the kernel panics when we read a protected kernel memory area.
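
The difference between the two strides can be sketched as follows. struct
geometry and both function names are hypothetical; only the field names
nr_luns and luns_per_chnl come from the message above.

```c
#include <stddef.h>

/* Hypothetical geometry; nr_luns is the device-wide total. */
struct geometry {
	size_t nr_chnls;
	size_t luns_per_chnl;
	size_t nr_luns; /* nr_chnls * luns_per_chnl */
};

/* Correct: stride by the number of luns per channel. */
static size_t lun_index(const struct geometry *g, size_t ch, size_t lun)
{
	return ch * g->luns_per_chnl + lun;
}

/* Buggy: stride by the total number of luns, overshooting for ch > 0. */
static size_t lun_index_buggy(const struct geometry *g, size_t ch, size_t lun)
{
	return ch * g->nr_luns + lun;
}
```

With 4 channels of 2 luns each, the last valid lun maps to index 7 with the
correct stride, while the buggy stride already yields 8 for channel 1,
lun 0, which is past the end of an 8-entry array.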

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-29 08:28:32 -07:00
Matias Bjørling
4158624454 lightnvm: do not compile in debugging by default
The LightNVM module exposes a debug interface when CONFIG_NVM_DEBUG is
set. This interface takes a string to configure media managers and
targets. Make sure this interface is only exposed when chosen
deliberately.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07 09:14:20 -07:00
Matias Bjørling
008b744382 lightnvm: prevent gennvm module unload on use
After the gennvm module has been initialized, it might be attached to
one or several devices. In that case, the module is in use. Make sure
that it cannot be unloaded.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07 09:14:19 -07:00
Matias Bjørling
762796bc9e lightnvm: fix media mgr registration
This patch fixes two issues during media manager registration.

1. The ppa pool can be used at media manager registration. Allocate the
ppa pool before that.

2. If a media manager can't be found, this should not lead to the
device being unallocated. A media manager that can manage the device
may be registered later. Only warn if a media manager fails
initialization.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07 09:14:19 -07:00
Matias Bjørling
16f26c3aa9 lightnvm: replace req queue with nvmdev for lld
In the case where a request queue is passed to the low-level LightNVM
device driver integration, the device driver might pass its admin
commands through another queue. Instead pass nvm_dev, and let the
low-level driver choose the appropriate queue.

Reported-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07 09:14:19 -07:00
Wenwei Tao
e9b76a80f1 lightnvm: refactor spin_unlock in gennvm_get_blk
The spin_unlock is duplicated multiple times. Jump to a single unlock
to improve the code flow.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07 09:14:19 -07:00
Wenwei Tao
d3d1a43842 lightnvm: put blks when luns configure failed
Put the allocated blocks back on the free list
when LUN configuration fails, to make these
blocks usable by others.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07 09:14:19 -07:00
Wenwei Tao
f27a629953 lightnvm: use flags in rrpc_get_blk
rrpc_get_blk uses the constant 0 as the flags argument
of nvm_get_blk. This may cause getting a GC block
to fail unexpectedly.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-12-07 09:14:19 -07:00
Wenwei Tao
d0a712ceb8 lightnvm: missing nvm_lock acquire
To avoid race conditions, traversing the dev, media manager,
and target lists, as well as registering and unregistering entries
on them, should always be done under the nvm_lock.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-29 14:34:58 -07:00
Matias Bjørling
08236c6bb2 lightnvm: unconverted ppa returned in get_bb_tbl
The get_bb_tbl function takes ppa as a generic address, which is
converted to the ppa device address within the device driver. When
the update_bbtbl callback is called from get_bb_tbl, the device
specific ppa is used, instead of the generic ppa.

Make sure to pass the generic ppa.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-29 14:34:58 -07:00
Wenwei Tao
d160147b5c lightnvm: do device max sectors boundary check first
Do the device max_phys_sect boundary check first. Otherwise
we allocate dma_pools for devices whose max sectors
are beyond what lightnvm supports, and register them.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-29 14:34:58 -07:00
Sudip Mukherjee
76e25081b6 lightnvm: fix ioctl memory leaks
If copy_to_user() fails, we return an error but miss releasing
devices.

Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-29 14:34:57 -07:00
Wenwei Tao
8261bd48c6 lightnvm: free memory when gennvm register fails
Free the allocated nvm block and gennvm lun structures when
gennvm register fails; otherwise memory is leaked.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-29 14:34:57 -07:00
Javier Gonzalez
2fde0e482d lightnvm: add free and bad lun info to show luns
Add free block, used block, and bad block information to the show debug
interface. This information is used to debug how targets track blocks.

Also, change debug function name to make it more generic.

Signed-off-by: Javier Gonzalez <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-20 08:33:21 -07:00
Javier Gonzalez
0b59733b95 lightnvm: keep track of block counts
Maintain the number of in-use blocks, free blocks, and bad blocks on a
per-lun basis. This allows the upper layers to get information about the
state of each lun.

Also, account for blocks reserved to the device in the free block count.
nr_free_blocks now matches the actual number of blocks on the free list
when the device is booted.

Signed-off-by: Javier Gonzalez <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-20 08:33:20 -07:00
Matias Bjørling
93e70c1f28 lightnvm: missing free on init error
If max_phys_sect is out of bounds, the nvm_dev structure is not
freed.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-20 08:33:16 -07:00
Wenwei Tao
480fc0db81 lightnvm: wrong return value and redundant free
The return value should be non-zero under error conditions.
Remove nvme_free(dev) to avoid freeing dev more than once.

Signed-off-by: Wenwei Tao <ww.tao0320@gmail.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-20 08:33:14 -07:00
Javier González
d09f9581b2 lightnvm: cleanup queue before target removal
This prevents outstanding IOs from being sent for completion to the target
after the target has been removed. The flow is now: stop new IOs > cleanup
queue > remove target.

Signed-off-by: Javier Gonzalez <javier@cnexlabs.com>
Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-16 15:20:41 -07:00
Matias Bjørling
7386af270c lightnvm: remove linear and device addr modes
The linear and device specific address modes can be replaced with a
simple offset and bit length conversion that is generic across all
devices.
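
An offset and bit length conversion of this kind might look like the sketch
below. The structures, field set and function names are assumptions made for
illustration; the real format has more fields and is reported by the device.

```c
#include <stdint.h>

/* Hypothetical per-device format: bit offset and bit length per field. */
struct addr_field {
	uint8_t offset, len;
};

struct addr_format {
	struct addr_field ch, lun, blk, pg;
};

/* Generic (host-side) address with one member per field. */
struct generic_addr {
	uint32_t ch, lun, blk, pg;
};

static uint64_t field_mask(uint8_t len)
{
	return len >= 64 ? ~(uint64_t)0 : (((uint64_t)1 << len) - 1);
}

static uint64_t pack_field(uint32_t v, struct addr_field f)
{
	return ((uint64_t)v & field_mask(f.len)) << f.offset;
}

static uint32_t unpack_field(uint64_t dev, struct addr_field f)
{
	return (uint32_t)((dev >> f.offset) & field_mask(f.len));
}

/* Generic address -> device address, using only shifts and masks. */
static uint64_t generic_to_dev(const struct addr_format *f,
			       struct generic_addr a)
{
	return pack_field(a.ch, f->ch) | pack_field(a.lun, f->lun) |
	       pack_field(a.blk, f->blk) | pack_field(a.pg, f->pg);
}

/* Device address -> generic address. */
static struct generic_addr dev_to_generic(const struct addr_format *f,
					  uint64_t dev)
{
	struct generic_addr a = {
		.ch = unpack_field(dev, f->ch),
		.lun = unpack_field(dev, f->lun),
		.blk = unpack_field(dev, f->blk),
		.pg = unpack_field(dev, f->pg),
	};
	return a;
}
```

Because the format is plain data, the same two functions serve every device,
so no per-device address mode is needed.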

This both simplifies the specification and removes the special case for
qemu nvme, that previously relied on the linear address mapping.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-16 15:20:34 -07:00
Matias Bjørling
c1480ad594 lightnvm: prevent double free on init error
Both nvm_register and nvm_init do a kfree(dev) on error. Make sure
to only free it once.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-16 15:20:33 -07:00
Matias Bjørling
edad2e6606 lightnvm: prematurely activate nvm_dev
We register with nvm_devices when registration can still fail.
Move the final registration to the end of the nvm_register function
to make sure we are fully registered when added to the nvm_devices list.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-16 15:20:31 -07:00
Matias Bjørling
4264c980e3 lightnvm: check for NAND flash and its type
Only NAND flash with SLC and MLC is supported. Make sure to not try to
initialize TLC memory or other non-volatile memory types.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-16 15:20:30 -07:00
Matias Bjørling
1145046983 lightnvm: update bad block table format
The specification was changed to reflect a multi-value bad block table.
Instead of bit-based bad block table, the bad block table now allows
eight bad block categories. Currently four are defined:

 * Factory bad blocks
 * Grown bad blocks
 * Device-side reserved blocks
 * Host-side reserved blocks

The factory and grown bad blocks are the regular bad blocks. The
reserved blocks are either for internal use or external use. In
particular, the device-side reserved blocks allow the host to
bootstrap from a limited number of flash blocks, reducing the flash
blocks to scan upon super block initialization.

Support for both get bad block table and set bad block table is added.
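
As a rough sketch, a multi-value table can be walked one byte per block. The
enum names and the one-bit-per-category values below are assumptions chosen
for illustration, not copied from the specification, and classify is a
hypothetical helper.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only: one bit per category, eight fit in a byte. */
enum blk_type {
	BLK_T_FREE          = 0x0,
	BLK_T_FACTORY_BAD   = 0x1,
	BLK_T_GROWN_BAD     = 0x2,
	BLK_T_DEV_RESERVED  = 0x4,
	BLK_T_HOST_RESERVED = 0x8,
};

struct blk_counts {
	size_t free, bad, reserved;
};

/* Walk a table with one byte per block and tally each kind. */
static struct blk_counts classify(const uint8_t *tbl, size_t nr_blks)
{
	struct blk_counts c = { 0, 0, 0 };

	for (size_t i = 0; i < nr_blks; i++) {
		if (tbl[i] & (BLK_T_FACTORY_BAD | BLK_T_GROWN_BAD))
			c.bad++;       /* regular bad blocks */
		else if (tbl[i] & (BLK_T_DEV_RESERVED | BLK_T_HOST_RESERVED))
			c.reserved++;  /* held back for device or host use */
		else
			c.free++;
	}
	return c;
}
```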

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-16 15:20:25 -07:00
Jens Axboe
dece16353e block: change ->make_request_fn() and users to return a queue cookie
No functional changes in this patch, but it prepares us for returning
a more useful cookie related to the IO that was queued up.

Signed-off-by: Jens Axboe <axboe@fb.com>
Acked-by: Christoph Hellwig <hch@lst.de>
Acked-by: Keith Busch <keith.busch@intel.com>
2015-11-07 10:40:46 -07:00
Matias Bjørling
b7ceb7d500 lightnvm: refactor phys addrs type to u64
For cases where CONFIG_LBDAF is not set, the struct ppa_addr exceeds its
type on 32-bit architectures. ppa_addr requires a 64-bit integer to hold
the generic ppa format. We therefore refactor it to u64 and
replace the sector_t usages with u64 for physical addresses.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-11-03 09:53:24 -07:00
Matias Bjørling
ae1519ec44 rrpc: Round-robin sector target with cost-based gc
This target allows an Open-Channel SSD to be exposed as a block
device.

It implements a round-robin approach for sector allocation,
together with a greedy cost-based garbage collector.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-29 16:21:42 +09:00
Matias Bjørling
48add0f5a6 gennvm: Generic NVM manager
The implementation for Open-Channel SSDs is divided into media
management and targets. This patch implements a generic media manager
for open-channel SSDs. After a media manager has been initialized,
single or multiple targets can be instantiated with the media managed as
the backend.

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-29 16:21:42 +09:00
Matias Bjørling
cd9e9808d1 lightnvm: Support for Open-Channel SSDs
Open-channel SSDs are devices that share responsibilities with the host
in order to implement and maintain features that typical SSDs keep
strictly in firmware. These include (i) the Flash Translation Layer
(FTL), (ii) bad block management, and (iii) hardware units such as the
flash controller, the interface controller, and large amounts of flash
chips. In this way, Open-channel SSDs expose direct access to their
physical flash storage, while keeping a subset of the internal features
of SSDs.

LightNVM is a specification that gives support to Open-channel SSDs.
LightNVM allows the host to manage data placement, garbage collection,
and parallelism. Device specific responsibilities such as bad block
management, FTL extensions to support atomic IOs, or metadata
persistence are still handled by the device.

The implementation of LightNVM consists of two parts: core and
(multiple) targets. The core implements functionality shared across
targets. This is initialization, teardown and statistics. The targets
implement the interface that exposes physical flash to user-space
applications. Examples of such targets include key-value store,
object-store, as well as traditional block devices, which can be
application-specific.

Contributions in this patch from:

  Javier Gonzalez <jg@lightnvm.io>
  Dongsheng Yang <yangds.fnst@cn.fujitsu.com>
  Jesper Madsen <jmad@itu.dk>

Signed-off-by: Matias Bjørling <m@bjorling.me>
Signed-off-by: Jens Axboe <axboe@fb.com>
2015-10-29 16:21:42 +09:00