Commit Graph

716 Commits

Author SHA1 Message Date
Rabin Vincent
8a1435880f ubi: fastmap: Fix slab corruption
Booting with UBI fastmap and SLUB debugging enabled results in the
following splats.  The problem is that ubi_scan_fastmap() moves the
fastmap blocks from the scan_ai (allocated in scan_fast()) to the ai
allocated in ubi_attach().  This results in two problems:

 - When the scan_ai is freed, aebs which were allocated from its slab
   cache are still in use.

 - When the other ai is being destroyed in destroy_ai(), the
   arguments to kmem_cache_free() call are incorrect since aebs on its
   ->fastmap list were allocated with a slab cache from a differnt ai.

Fix this by making a copy of the aebs in ubi_scan_fastmap() instead of
moving them.

 =============================================================================
 BUG ubi_aeb_slab_cache (Not tainted): Objects remaining in ubi_aeb_slab_cache on __kmem_cache_shutdown()
 -----------------------------------------------------------------------------

 INFO: Slab 0xbfd2da3c objects=17 used=1 fp=0xb33d7748 flags=0x40000080
 CPU: 1 PID: 118 Comm: ubiattach Tainted: G    B           4.9.15 #3
 [<80111910>] (unwind_backtrace) from [<8010d498>] (show_stack+0x18/0x1c)
 [<8010d498>] (show_stack) from [<804a3274>] (dump_stack+0xb4/0xe0)
 [<804a3274>] (dump_stack) from [<8026c47c>] (slab_err+0x78/0x88)
 [<8026c47c>] (slab_err) from [<802735bc>] (__kmem_cache_shutdown+0x180/0x3e0)
 [<802735bc>] (__kmem_cache_shutdown) from [<8024e13c>] (shutdown_cache+0x1c/0x60)
 [<8024e13c>] (shutdown_cache) from [<8024ed64>] (kmem_cache_destroy+0x19c/0x20c)
 [<8024ed64>] (kmem_cache_destroy) from [<8057cc14>] (destroy_ai+0x1dc/0x1e8)
 [<8057cc14>] (destroy_ai) from [<8057f04c>] (ubi_attach+0x3f4/0x450)
 [<8057f04c>] (ubi_attach) from [<8056fe70>] (ubi_attach_mtd_dev+0x60c/0xff8)
 [<8056fe70>] (ubi_attach_mtd_dev) from [<80571d78>] (ctrl_cdev_ioctl+0x110/0x2b8)
 [<80571d78>] (ctrl_cdev_ioctl) from [<8029c77c>] (do_vfs_ioctl+0xac/0xa00)
 [<8029c77c>] (do_vfs_ioctl) from [<8029d10c>] (SyS_ioctl+0x3c/0x64)
 [<8029d10c>] (SyS_ioctl) from [<80108860>] (ret_fast_syscall+0x0/0x1c)
 INFO: Object 0xb33d7e88 @offset=3720
 INFO: Allocated in scan_peb+0x608/0x81c age=72 cpu=1 pid=118
 	kmem_cache_alloc+0x3b0/0x43c
 	scan_peb+0x608/0x81c
 	ubi_attach+0x124/0x450
 	ubi_attach_mtd_dev+0x60c/0xff8
 	ctrl_cdev_ioctl+0x110/0x2b8
 	do_vfs_ioctl+0xac/0xa00
 	SyS_ioctl+0x3c/0x64
 	ret_fast_syscall+0x0/0x1c
 kmem_cache_destroy ubi_aeb_slab_cache: Slab cache still has objects
 CPU: 1 PID: 118 Comm: ubiattach Tainted: G    B           4.9.15 #3
 [<80111910>] (unwind_backtrace) from [<8010d498>] (show_stack+0x18/0x1c)
 [<8010d498>] (show_stack) from [<804a3274>] (dump_stack+0xb4/0xe0)
 [<804a3274>] (dump_stack) from [<8024ed80>] (kmem_cache_destroy+0x1b8/0x20c)
 [<8024ed80>] (kmem_cache_destroy) from [<8057cc14>] (destroy_ai+0x1dc/0x1e8)
 [<8057cc14>] (destroy_ai) from [<8057f04c>] (ubi_attach+0x3f4/0x450)
 [<8057f04c>] (ubi_attach) from [<8056fe70>] (ubi_attach_mtd_dev+0x60c/0xff8)
 [<8056fe70>] (ubi_attach_mtd_dev) from [<80571d78>] (ctrl_cdev_ioctl+0x110/0x2b8)
 [<80571d78>] (ctrl_cdev_ioctl) from [<8029c77c>] (do_vfs_ioctl+0xac/0xa00)
 [<8029c77c>] (do_vfs_ioctl) from [<8029d10c>] (SyS_ioctl+0x3c/0x64)
 [<8029d10c>] (SyS_ioctl) from [<80108860>] (ret_fast_syscall+0x0/0x1c)
 cache_from_obj: Wrong slab cache. ubi_aeb_slab_cache but object is from ubi_aeb_slab_cache
 ------------[ cut here ]------------
 WARNING: CPU: 1 PID: 118 at mm/slab.h:354 kmem_cache_free+0x39c/0x450
 Modules linked in:
 CPU: 1 PID: 118 Comm: ubiattach Tainted: G    B           4.9.15 #3
 [<80111910>] (unwind_backtrace) from [<8010d498>] (show_stack+0x18/0x1c)
 [<8010d498>] (show_stack) from [<804a3274>] (dump_stack+0xb4/0xe0)
 [<804a3274>] (dump_stack) from [<80120e40>] (__warn+0xf4/0x10c)
 [<80120e40>] (__warn) from [<80120f20>] (warn_slowpath_null+0x28/0x30)
 [<80120f20>] (warn_slowpath_null) from [<80271fe0>] (kmem_cache_free+0x39c/0x450)
 [<80271fe0>] (kmem_cache_free) from [<8057cb88>] (destroy_ai+0x150/0x1e8)
 [<8057cb88>] (destroy_ai) from [<8057ef1c>] (ubi_attach+0x2c4/0x450)
 [<8057ef1c>] (ubi_attach) from [<8056fe70>] (ubi_attach_mtd_dev+0x60c/0xff8)
 [<8056fe70>] (ubi_attach_mtd_dev) from [<80571d78>] (ctrl_cdev_ioctl+0x110/0x2b8)
 [<80571d78>] (ctrl_cdev_ioctl) from [<8029c77c>] (do_vfs_ioctl+0xac/0xa00)
 [<8029c77c>] (do_vfs_ioctl) from [<8029d10c>] (SyS_ioctl+0x3c/0x64)
 [<8029d10c>] (SyS_ioctl) from [<80108860>] (ret_fast_syscall+0x0/0x1c)
 ---[ end trace 2bd8396277fd0a0b ]---
 =============================================================================
 BUG ubi_aeb_slab_cache (Tainted: G    B   W      ): page slab pointer corrupt.
 -----------------------------------------------------------------------------

 INFO: Allocated in scan_peb+0x608/0x81c age=104 cpu=1 pid=118
 	kmem_cache_alloc+0x3b0/0x43c
 	scan_peb+0x608/0x81c
 	ubi_attach+0x124/0x450
 	ubi_attach_mtd_dev+0x60c/0xff8
 	ctrl_cdev_ioctl+0x110/0x2b8
 	do_vfs_ioctl+0xac/0xa00
 	SyS_ioctl+0x3c/0x64
 	ret_fast_syscall+0x0/0x1c
 INFO: Slab 0xbfd2da3c objects=17 used=1 fp=0xb33d7748 flags=0x40000081
 INFO: Object 0xb33d7e88 @offset=3720 fp=0xb33d7da0

 Redzone b33d7e80: cc cc cc cc cc cc cc cc                          ........
 Object b33d7e88: 02 00 00 00 01 00 00 00 00 f0 ff 7f ff ff ff ff  ................
 Object b33d7e98: 00 00 00 00 00 00 00 00 bd 16 00 00 00 00 00 00  ................
 Object b33d7ea8: 00 01 00 00 00 02 00 00 00 00 00 00 00 00 00 00  ................
 Redzone b33d7eb8: cc cc cc cc                                      ....
 Padding b33d7f60: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ
 CPU: 1 PID: 118 Comm: ubiattach Tainted: G    B   W       4.9.15 #3
 [<80111910>] (unwind_backtrace) from [<8010d498>] (show_stack+0x18/0x1c)
 [<8010d498>] (show_stack) from [<804a3274>] (dump_stack+0xb4/0xe0)
 [<804a3274>] (dump_stack) from [<80271770>] (free_debug_processing+0x320/0x3c4)
 [<80271770>] (free_debug_processing) from [<80271ad0>] (__slab_free+0x2bc/0x430)
 [<80271ad0>] (__slab_free) from [<80272024>] (kmem_cache_free+0x3e0/0x450)
 [<80272024>] (kmem_cache_free) from [<8057cb88>] (destroy_ai+0x150/0x1e8)
 [<8057cb88>] (destroy_ai) from [<8057ef1c>] (ubi_attach+0x2c4/0x450)
 [<8057ef1c>] (ubi_attach) from [<8056fe70>] (ubi_attach_mtd_dev+0x60c/0xff8)
 [<8056fe70>] (ubi_attach_mtd_dev) from [<80571d78>] (ctrl_cdev_ioctl+0x110/0x2b8)
 [<80571d78>] (ctrl_cdev_ioctl) from [<8029c77c>] (do_vfs_ioctl+0xac/0xa00)
 [<8029c77c>] (do_vfs_ioctl) from [<8029d10c>] (SyS_ioctl+0x3c/0x64)
 [<8029d10c>] (SyS_ioctl) from [<80108860>] (ret_fast_syscall+0x0/0x1c)
 FIX ubi_aeb_slab_cache: Object at 0xb33d7e88 not freed

Signed-off-by: Rabin Vincent <rabinv@axis.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2017-05-08 20:48:33 +02:00
Andy Shevchenko
997d30cb74 ubi: Make mtd parameter readable
Fix permissions to allow read mtd parameter back (only for owner).

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2017-05-08 20:48:17 +02:00
Andy Shevchenko
435009d406 ubi: Fix section mismatch
WARNING: vmlinux.o(.text+0x1f2a80): Section mismatch in reference from the variable __param_ops_mtd to the function .init.text:ubi_mtd_param_parse()
The function __param_ops_mtd() references
the function __init ubi_mtd_param_parse().
This is often because __param_ops_mtd lacks a __init
annotation or the annotation of ubi_mtd_param_parse is wrong.

Cc: Richard Weinberger <richard@nod.at>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2017-05-08 20:47:59 +02:00
Linus Torvalds
044f1daaaa Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes and updates from Jens Axboe:
 "Some fixes and followup features/changes that should go in, in this
  merge window. This contains:

   - Two fixes for lightnvm from Javier, fixing problems in the new code
     merge previously in this merge window.

   - A fix from Jan for the backing device changes, fixing an issue in
     NFS that causes a failure to mount on certain setups.

   - A change from Christoph, cleaning up the blk-mq init and exit
     request paths.

   - Remove elevator_change(), which is now unused. From Bart.

   - A fix for queue operation invocation on a dead queue, from Bart.

   - A series fixing up mtip32xx for blk-mq scheduling, removing a
     bandaid we previously had in place for this. From me.

   - A regression fix for this series, fixing a case where we wait on
     workqueue flushing from an invalid (non-blocking) context. From me.

   - A fix/optimization from Ming, ensuring that we don't both quiesce
     and freeze a queue at the same time.

   - A fix from Peter on lock ordering for CPU hotplug. Not a real
     problem right now, but will be once the CPU hotplug rework goes in.

   - A series from Omar, cleaning up out blk-mq debugfs support, and
     adding support for exporting info from schedulers in debugfs as
     well. This is really useful in debugging stalls or livelocks. From
     Omar"

* 'for-linus' of git://git.kernel.dk/linux-block: (28 commits)
  mq-deadline: add debugfs attributes
  kyber: add debugfs attributes
  blk-mq-debugfs: allow schedulers to register debugfs attributes
  blk-mq: untangle debugfs and sysfs
  blk-mq: move debugfs declarations to a separate header file
  blk-mq: Do not invoke queue operations on a dead queue
  blk-mq-debugfs: get rid of a bunch of boilerplate
  blk-mq-debugfs: rename hw queue directories from <n> to hctx<n>
  blk-mq-debugfs: don't open code strstrip()
  blk-mq-debugfs: error on long write to queue "state" file
  blk-mq-debugfs: clean up flag definitions
  blk-mq-debugfs: separate flags with |
  nfs: Fix bdi handling for cloned superblocks
  block/mq: Cure cpu hotplug lock inversion
  lightnvm: fix bad back free on error path
  lightnvm: create cmd before allocating request
  blk-mq: don't use sync workqueue flushing from drivers
  mtip32xx: convert internal commands to regular block infrastructure
  mtip32xx: cleanup internal tag assumptions
  block: don't call blk_mq_quiesce_queue() after queue is frozen
  ...
2017-05-06 11:25:08 -07:00
Linus Torvalds
af82455f7d char/misc patches for 4.12-rc1
Here is the big set of new char/misc driver drivers and features for
 4.12-rc1.
 
 There's lots of new drivers added this time around, new firmware drivers
 from Google, more auxdisplay drivers, extcon drivers, fpga drivers, and
 a bunch of other driver updates.  Nothing major, except if you happen to
 have the hardware for these drivers, and then you will be happy :)
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWQvAgg8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+yknsACgzkAeyz16Z97J3UTaeejbR7nKUCAAoKY4WEHY
 8O9f9pr9gj8GMBwxeZQa
 =OIfB
 -----END PGP SIGNATURE-----

Merge tag 'char-misc-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc driver updates from Greg KH:
 "Here is the big set of new char/misc driver drivers and features for
  4.12-rc1.

  There's lots of new drivers added this time around, new firmware
  drivers from Google, more auxdisplay drivers, extcon drivers, fpga
  drivers, and a bunch of other driver updates. Nothing major, except if
  you happen to have the hardware for these drivers, and then you will
  be happy :)

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-4.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (136 commits)
  firmware: google memconsole: Fix return value check in platform_memconsole_init()
  firmware: Google VPD: Fix return value check in vpd_platform_init()
  goldfish_pipe: fix build warning about using too much stack.
  goldfish_pipe: An implementation of more parallel pipe
  fpga fr br: update supported version numbers
  fpga: region: release FPGA region reference in error path
  fpga altera-hps2fpga: disable/unprepare clock on error in alt_fpga_bridge_probe()
  mei: drop the TODO from samples
  firmware: Google VPD sysfs driver
  firmware: Google VPD: import lib_vpd source files
  misc: lkdtm: Add volatile to intentional NULL pointer reference
  eeprom: idt_89hpesx: Add OF device ID table
  misc: ds1682: Add OF device ID table
  misc: tsl2550: Add OF device ID table
  w1: Remove unneeded use of assert() and remove w1_log.h
  w1: Use kernel common min() implementation
  uio_mf624: Align memory regions to page size and set correct offsets
  uio_mf624: Refactor memory info initialization
  uio: Allow handling of non page-aligned memory regions
  hangcheck-timer: Fix typo in comment
  ...
2017-05-04 19:15:35 -07:00
Linus Torvalds
89c9fea3c8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial tree updates from Jiri Kosina.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
  tty: fix comment for __tty_alloc_driver()
  init/main: properly align the multi-line comment
  init/main: Fix double "the" in comment
  Fix dead URLs to ftp.kernel.org
  drivers: Clean up duplicated email address
  treewide: Fix typo in xml/driver-api/basics.xml
  tools/testing/selftests/powerpc: remove redundant CFLAGS in Makefile: "-Wall -O2 -Wall" -> "-O2 -Wall"
  selftests/timers: Spelling s/privledges/privileges/
  HID: picoLCD: Spelling s/REPORT_WRTIE_MEMORY/REPORT_WRITE_MEMORY/
  net: phy: dp83848: Fix Typo
  UBI: Fix typos
  Documentation: ftrace.txt: Correct nice value of 120 priority
  net: fec: Fix typo in error msg and comment
  treewide: Fix typos in printk
2017-05-02 19:09:35 -07:00
Christoph Hellwig
d6296d39e9 blk-mq: update ->init_request and ->exit_request prototypes
Remove the request_idx parameter, which can't be used safely now that we
support I/O schedulers with blk-mq.  Except for a superflous check in
mtip32xx it was unused anyway.

Also pass the tag_set instead of just the driver data - this allows drivers
to avoid some code duplication in a follow on cleanup.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-05-02 07:52:08 -06:00
Linus Torvalds
694752922b Merge branch 'for-4.12/block' of git://git.kernel.dk/linux-block
Pull block layer updates from Jens Axboe:

 - Add BFQ IO scheduler under the new blk-mq scheduling framework. BFQ
   was initially a fork of CFQ, but subsequently changed to implement
   fairness based on B-WF2Q+, a modified variant of WF2Q. BFQ is meant
   to be used on desktop type single drives, providing good fairness.
   From Paolo.

 - Add Kyber IO scheduler. This is a full multiqueue aware scheduler,
   using a scalable token based algorithm that throttles IO based on
   live completion IO stats, similary to blk-wbt. From Omar.

 - A series from Jan, moving users to separately allocated backing
   devices. This continues the work of separating backing device life
   times, solving various problems with hot removal.

 - A series of updates for lightnvm, mostly from Javier. Includes a
   'pblk' target that exposes an open channel SSD as a physical block
   device.

 - A series of fixes and improvements for nbd from Josef.

 - A series from Omar, removing queue sharing between devices on mostly
   legacy drivers. This helps us clean up other bits, if we know that a
   queue only has a single device backing. This has been overdue for
   more than a decade.

 - Fixes for the blk-stats, and improvements to unify the stats and user
   windows. This both improves blk-wbt, and enables other users to
   register a need to receive IO stats for a device. From Omar.

 - blk-throttle improvements from Shaohua. This provides a scalable
   framework for implementing scalable priotization - particularly for
   blk-mq, but applicable to any type of block device. The interface is
   marked experimental for now.

 - Bucketized IO stats for IO polling from Stephen Bates. This improves
   efficiency of polled workloads in the presence of mixed block size
   IO.

 - A few fixes for opal, from Scott.

 - A few pulls for NVMe, including a lot of fixes for NVMe-over-fabrics.
   From a variety of folks, mostly Sagi and James Smart.

 - A series from Bart, improving our exposed info and capabilities from
   the blk-mq debugfs support.

 - A series from Christoph, cleaning up how handle WRITE_ZEROES.

 - A series from Christoph, cleaning up the block layer handling of how
   we track errors in a request. On top of being a nice cleanup, it also
   shrinks the size of struct request a bit.

 - Removal of mg_disk and hd (sorry Linus) by Christoph. The former was
   never used by platforms, and the latter has outlived it's usefulness.

 - Various little bug fixes and cleanups from a wide variety of folks.

* 'for-4.12/block' of git://git.kernel.dk/linux-block: (329 commits)
  block: hide badblocks attribute by default
  blk-mq: unify hctx delay_work and run_work
  block: add kblock_mod_delayed_work_on()
  blk-mq: unify hctx delayed_run_work and run_work
  nbd: fix use after free on module unload
  MAINTAINERS: bfq: Add Paolo as maintainer for the BFQ I/O scheduler
  blk-mq-sched: alloate reserved tags out of normal pool
  mtip32xx: use runtime tag to initialize command header
  scsi: Implement blk_mq_ops.show_rq()
  blk-mq: Add blk_mq_ops.show_rq()
  blk-mq: Show operation, cmd_flags and rq_flags names
  blk-mq: Make blk_flags_show() callers append a newline character
  blk-mq: Move the "state" debugfs attribute one level down
  blk-mq: Unregister debugfs attributes earlier
  blk-mq: Only unregister hctxs for which registration succeeded
  blk-mq-debugfs: Rename functions for registering and unregistering the mq directory
  blk-mq: Let blk_mq_debugfs_register() look up the queue name
  blk-mq: Register <dev>/queue/mq after having registered <dev>/queue
  ide-pm: always pass 0 error to ide_complete_rq in ide_do_devset
  ide-pm: always pass 0 error to __blk_end_request_all
  ..
2017-05-01 10:39:57 -07:00
Eric Biggers
f363b089be blk-mq: constify struct blk_mq_ops
Constify all instances of blk_mq_ops, as they are never modified.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-03-31 08:28:58 -06:00
Sebastian Siewior
9cd9a21ce0 ubi/upd: Always flush after prepared for an update
In commit 6afaf8a484 ("UBI: flush wl before clearing update marker") I
managed to trigger and fix a similar bug. Now here is another version of
which I assumed it wouldn't matter back then but it turns out UBI has a
check for it and will error out like this:

|ubi0 warning: validate_vid_hdr: inconsistent used_ebs
|ubi0 error: validate_vid_hdr: inconsistent VID header at PEB 592

All you need to trigger this is? "ubiupdatevol /dev/ubi0_0 file" + a
powercut in the middle of the operation.
ubi_start_update() sets the update-marker and puts all EBs on the erase
list. After that userland can proceed to write new data while the old EB
aren't erased completely. A powercut at this point is usually not that
much of a tragedy. UBI won't give read access to the static volume
because it has the update marker. It will most likely set the corrupted
flag because it misses some EBs.
So we are all good. Unless the size of the image that has been written
differs from the old image in the magnitude of at least one EB. In that
case UBI will find two different values for `used_ebs' and refuse to
attach the image with the error message mentioned above.

So in order not to get in the situation, the patch will ensure that we
wait until everything is removed before it tries to write any data.
The alternative would be to detect such a case and remove all EBs at the
attached time after we processed the volume-table and see the
update-marker set. The patch looks bigger and I doubt it is worth it
since usually the write() will wait from time to time for a new EB since
usually there not that many spare EB that can be used.

Cc: stable@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
2017-03-30 09:27:11 +02:00
Andrew F. Davis
2fae13124f UBI: Fix typos
Signed-off-by: Andrew F. Davis <afd@ti.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2017-03-24 15:33:32 +01:00
Logan Gunthorpe
493cfaeaa0 mtd: utilize new cdev_device_add helper function
This is not as straightforward a conversion as the others
in this series. These drivers did not originally make use of
kobj.parent so they likely suffered from a use after free bug if
someone unregistered the devices while they are being used.

In order to make the conversions, switch from device_register
to device_initialize / cdev_device_add.

In build.c, this patch unwinds a complicated mess of extra
get_device/put_devices and reference tracking by moving device_initialize
early in the attach process. Then it always uses put_device and instead of
using device_unregister and extra get_devices everywhere we just use
cdev_device_del and one put_device once everything is completely done.
This simplifies things dramatically and makes it easier to reason about.

In vmt.c, the patch pushes device initialization up to the beginning of the
device creation and then that function only needs to use put_device
in the error path which simplifies things a good deal.

Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-03-21 06:44:33 +01:00
David Howells
a528d35e8b statx: Add a system call to make enhanced file info available
Add a system call to make extended file information available, including
file creation and some attribute flags where available through the
underlying filesystem.

The getattr inode operation is altered to take two additional arguments: a
u32 request_mask and an unsigned int flags that indicate the
synchronisation mode.  This change is propagated to the vfs_getattr*()
function.

Functions like vfs_stat() are now inline wrappers around new functions
vfs_statx() and vfs_statx_fd() to reduce stack usage.

========
OVERVIEW
========

The idea was initially proposed as a set of xattrs that could be retrieved
with getxattr(), but the general preference proved to be for a new syscall
with an extended stat structure.

A number of requests were gathered for features to be included.  The
following have been included:

 (1) Make the fields a consistent size on all arches and make them large.

 (2) Spare space, request flags and information flags are provided for
     future expansion.

 (3) Better support for the y2038 problem [Arnd Bergmann] (tv_sec is an
     __s64).

 (4) Creation time: The SMB protocol carries the creation time, which could
     be exported by Samba, which will in turn help CIFS make use of
     FS-Cache as that can be used for coherency data (stx_btime).

     This is also specified in NFSv4 as a recommended attribute and could
     be exported by NFSD [Steve French].

 (5) Lightweight stat: Ask for just those details of interest, and allow a
     netfs (such as NFS) to approximate anything not of interest, possibly
     without going to the server [Trond Myklebust, Ulrich Drepper, Andreas
     Dilger] (AT_STATX_DONT_SYNC).

 (6) Heavyweight stat: Force a netfs to go to the server, even if it thinks
     its cached attributes are up to date [Trond Myklebust]
     (AT_STATX_FORCE_SYNC).

And the following have been left out for future extension:

 (7) Data version number: Could be used by userspace NFS servers [Aneesh
     Kumar].

     Can also be used to modify fill_post_wcc() in NFSD which retrieves
     i_version directly, but has just called vfs_getattr().  It could get
     it from the kstat struct if it used vfs_xgetattr() instead.

     (There's disagreement on the exact semantics of a single field, since
     not all filesystems do this the same way).

 (8) BSD stat compatibility: Including more fields from the BSD stat such
     as creation time (st_btime) and inode generation number (st_gen)
     [Jeremy Allison, Bernd Schubert].

 (9) Inode generation number: Useful for FUSE and userspace NFS servers
     [Bernd Schubert].

     (This was asked for but later deemed unnecessary with the
     open-by-handle capability available and caused disagreement as to
     whether it's a security hole or not).

(10) Extra coherency data may be useful in making backups [Andreas Dilger].

     (No particular data were offered, but things like last backup
     timestamp, the data version number and the DOS archive bit would come
     into this category).

(11) Allow the filesystem to indicate what it can/cannot provide: A
     filesystem can now say it doesn't support a standard stat feature if
     that isn't available, so if, for instance, inode numbers or UIDs don't
     exist or are fabricated locally...

     (This requires a separate system call - I have an fsinfo() call idea
     for this).

(12) Store a 16-byte volume ID in the superblock that can be returned in
     struct xstat [Steve French].

     (Deferred to fsinfo).

(13) Include granularity fields in the time data to indicate the
     granularity of each of the times (NFSv4 time_delta) [Steve French].

     (Deferred to fsinfo).

(14) FS_IOC_GETFLAGS value.  These could be translated to BSD's st_flags.
     Note that the Linux IOC flags are a mess and filesystems such as Ext4
     define flags that aren't in linux/fs.h, so translation in the kernel
     may be a necessity (or, possibly, we provide the filesystem type too).

     (Some attributes are made available in stx_attributes, but the general
     feeling was that the IOC flags were to ext[234]-specific and shouldn't
     be exposed through statx this way).

(15) Mask of features available on file (eg: ACLs, seclabel) [Brad Boyer,
     Michael Kerrisk].

     (Deferred, probably to fsinfo.  Finding out if there's an ACL or
     seclabal might require extra filesystem operations).

(16) Femtosecond-resolution timestamps [Dave Chinner].

     (A __reserved field has been left in the statx_timestamp struct for
     this - if there proves to be a need).

(17) A set multiple attributes syscall to go with this.

===============
NEW SYSTEM CALL
===============

The new system call is:

	int ret = statx(int dfd,
			const char *filename,
			unsigned int flags,
			unsigned int mask,
			struct statx *buffer);

The dfd, filename and flags parameters indicate the file to query, in a
similar way to fstatat().  There is no equivalent of lstat() as that can be
emulated with statx() by passing AT_SYMLINK_NOFOLLOW in flags.  There is
also no equivalent of fstat() as that can be emulated by passing a NULL
filename to statx() with the fd of interest in dfd.

Whether or not statx() synchronises the attributes with the backing store
can be controlled by OR'ing a value into the flags argument (this typically
only affects network filesystems):

 (1) AT_STATX_SYNC_AS_STAT tells statx() to behave as stat() does in this
     respect.

 (2) AT_STATX_FORCE_SYNC will require a network filesystem to synchronise
     its attributes with the server - which might require data writeback to
     occur to get the timestamps correct.

 (3) AT_STATX_DONT_SYNC will suppress synchronisation with the server in a
     network filesystem.  The resulting values should be considered
     approximate.

mask is a bitmask indicating the fields in struct statx that are of
interest to the caller.  The user should set this to STATX_BASIC_STATS to
get the basic set returned by stat().  It should be noted that asking for
more information may entail extra I/O operations.

buffer points to the destination for the data.  This must be 256 bytes in
size.

======================
MAIN ATTRIBUTES RECORD
======================

The following structures are defined in which to return the main attribute
set:

	struct statx_timestamp {
		__s64	tv_sec;
		__s32	tv_nsec;
		__s32	__reserved;
	};

	struct statx {
		__u32	stx_mask;
		__u32	stx_blksize;
		__u64	stx_attributes;
		__u32	stx_nlink;
		__u32	stx_uid;
		__u32	stx_gid;
		__u16	stx_mode;
		__u16	__spare0[1];
		__u64	stx_ino;
		__u64	stx_size;
		__u64	stx_blocks;
		__u64	__spare1[1];
		struct statx_timestamp	stx_atime;
		struct statx_timestamp	stx_btime;
		struct statx_timestamp	stx_ctime;
		struct statx_timestamp	stx_mtime;
		__u32	stx_rdev_major;
		__u32	stx_rdev_minor;
		__u32	stx_dev_major;
		__u32	stx_dev_minor;
		__u64	__spare2[14];
	};

The defined bits in request_mask and stx_mask are:

	STATX_TYPE		Want/got stx_mode & S_IFMT
	STATX_MODE		Want/got stx_mode & ~S_IFMT
	STATX_NLINK		Want/got stx_nlink
	STATX_UID		Want/got stx_uid
	STATX_GID		Want/got stx_gid
	STATX_ATIME		Want/got stx_atime{,_ns}
	STATX_MTIME		Want/got stx_mtime{,_ns}
	STATX_CTIME		Want/got stx_ctime{,_ns}
	STATX_INO		Want/got stx_ino
	STATX_SIZE		Want/got stx_size
	STATX_BLOCKS		Want/got stx_blocks
	STATX_BASIC_STATS	[The stuff in the normal stat struct]
	STATX_BTIME		Want/got stx_btime{,_ns}
	STATX_ALL		[All currently available stuff]

stx_btime is the file creation time, stx_mask is a bitmask indicating the
data provided and __spares*[] are where as-yet undefined fields can be
placed.

Time fields are structures with separate seconds and nanoseconds fields
plus a reserved field in case we want to add even finer resolution.  Note
that times will be negative if before 1970; in such a case, the nanosecond
fields will also be negative if not zero.

The bits defined in the stx_attributes field convey information about a
file, how it is accessed, where it is and what it does.  The following
attributes map to FS_*_FL flags and are the same numerical value:

	STATX_ATTR_COMPRESSED		File is compressed by the fs
	STATX_ATTR_IMMUTABLE		File is marked immutable
	STATX_ATTR_APPEND		File is append-only
	STATX_ATTR_NODUMP		File is not to be dumped
	STATX_ATTR_ENCRYPTED		File requires key to decrypt in fs

Within the kernel, the supported flags are listed by:

	KSTAT_ATTR_FS_IOC_FLAGS

[Are any other IOC flags of sufficient general interest to be exposed
through this interface?]

New flags include:

	STATX_ATTR_AUTOMOUNT		Object is an automount trigger

These are for the use of GUI tools that might want to mark files specially,
depending on what they are.

Fields in struct statx come in a number of classes:

 (0) stx_dev_*, stx_blksize.

     These are local system information and are always available.

 (1) stx_mode, stx_nlinks, stx_uid, stx_gid, stx_[amc]time, stx_ino,
     stx_size, stx_blocks.

     These will be returned whether the caller asks for them or not.  The
     corresponding bits in stx_mask will be set to indicate whether they
     actually have valid values.

     If the caller didn't ask for them, then they may be approximated.  For
     example, NFS won't waste any time updating them from the server,
     unless as a byproduct of updating something requested.

     If the values don't actually exist for the underlying object (such as
     UID or GID on a DOS file), then the bit won't be set in the stx_mask,
     even if the caller asked for the value.  In such a case, the returned
     value will be a fabrication.

     Note that there are instances where the type might not be valid, for
     instance Windows reparse points.

 (2) stx_rdev_*.

     This will be set only if stx_mode indicates we're looking at a
     blockdev or a chardev, otherwise will be 0.

 (3) stx_btime.

     Similar to (1), except this will be set to 0 if it doesn't exist.

=======
TESTING
=======

The following test program can be used to test the statx system call:

	samples/statx/test-statx.c

Just compile and run, passing it paths to the files you want to examine.
The file is built automatically if CONFIG_SAMPLES is enabled.

Here's some example output.  Firstly, an NFS directory that crosses to
another FSID.  Note that the AUTOMOUNT attribute is set because transiting
this directory will cause d_automount to be invoked by the VFS.

	[root@andromeda ~]# /tmp/test-statx -A /warthog/data
	statx(/warthog/data) = 0
	results=7ff
	  Size: 4096            Blocks: 8          IO Block: 1048576  directory
	Device: 00:26           Inode: 1703937     Links: 125
	Access: (3777/drwxrwxrwx)  Uid:     0   Gid:  4041
	Access: 2016-11-24 09:02:12.219699527+0000
	Modify: 2016-11-17 10:44:36.225653653+0000
	Change: 2016-11-17 10:44:36.225653653+0000
	Attributes: 0000000000001000 (-------- -------- -------- -------- -------- -------- ---m---- --------)

Secondly, the result of automounting on that directory.

	[root@andromeda ~]# /tmp/test-statx /warthog/data
	statx(/warthog/data) = 0
	results=7ff
	  Size: 4096            Blocks: 8          IO Block: 1048576  directory
	Device: 00:27           Inode: 2           Links: 125
	Access: (3777/drwxrwxrwx)  Uid:     0   Gid:  4041
	Access: 2016-11-24 09:02:12.219699527+0000
	Modify: 2016-11-17 10:44:36.225653653+0000
	Change: 2016-11-17 10:44:36.225653653+0000

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-03-02 20:51:15 -05:00
Christoph Hellwig
aebf526b53 block: fold cmd_type into the REQ_OP_ space
Instead of keeping two levels of indirection for requests types, fold it
all into the operations.  The little caveat here is that previously
cmd_type only applied to struct request, while the request and bio op
fields were set to plain REQ_OP_READ/WRITE even for passthrough
operations.

Instead this patch adds new REQ_OP_* for SCSI passthrough and driver
private requests, althought it has to add two for each so that we
can communicate the data in/out nature of the request.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@fb.com>
2017-01-31 14:00:44 -07:00
Boris Brezillon
40b6e61ac7 ubi: fastmap: Fix add_vol() return value test in ubi_attach_fastmap()
Commit e96a8a3bb6 ("UBI: Fastmap: Do not add vol if it already
exists") introduced a bug by changing the possible error codes returned
by add_vol():
- this function no longer returns NULL in case of allocation failure
  but return ERR_PTR(-ENOMEM)
- when a duplicate entry in the volume RB tree is found it returns
  ERR_PTR(-EEXIST) instead of ERR_PTR(-EINVAL)

Fix the tests done on add_vol() return val to match this new behavior.

Fixes: e96a8a3bb6 ("UBI: Fastmap: Do not add vol if it already exists")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Acked-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-28 14:48:18 +02:00
Geert Uytterhoeven
884a3b6478 UBI: Fix crash in try_recover_peb()
drivers/mtd/ubi/eba.c: In function ‘try_recover_peb’:
    drivers/mtd/ubi/eba.c:744: warning: ‘vid_hdr’ is used uninitialized in this function

The pointer vid_hdr is indeed not initialized, leading to a crash when
it is dereferenced.

Fix this by obtaining the pointer from the VID buffer, like is done
everywhere else.

Fixes: 3291b52f9f ("UBI: introduce the VID buffer concept")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-20 00:06:06 +02:00
Colin Ian King
a15cf34fed ubi: fix swapped arguments to call to ubi_alloc_aeb
Static analysis by CoverityScan detected the ec and pnum
arguments are in the wrong order on a call to ubi_alloc_aeb.
Swap the order to fix this.

Fixes: 91f4285fe3 ("UBI: provide helpers to allocate and free aeb elements")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-20 00:06:00 +02:00
Linus Torvalds
4c609922a3 This pull request contains:
* Fixes for both UBI and UBIFS
 * overlayfs support (O_TMPFILE, RENAME_WHITEOUT/EXCHANGE)
 * Code refactoring for the upcoming MLC support
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJX/QOCAAoJEEtJtSqsAOnWtp4QAKItkx/LrW44rHhkoJfqG62i
 o+OaxMKNu43/v/io+68JNEkIqgEap2vMZVkfoIgIyuyPxMG7nA/zG3c2JFvQ/ReS
 uH0PmcpkIXbRBKe9IEn6rXmRz9q9UTNGhP2U5kg0rL22vwVGYIuzF4Bny25Irzf/
 LLtYOkpfZfaNTSjs1pmuJMWVFF1Rj68eVJEWL6JZ1BPQ4bRPbn5sNgOKNTJYkrJs
 GcXXNtonf3B0zOzFnmfFhVO5neo4FEG3QEQafR+qbhoNBvXSluVIAFoO4VKEcyHD
 BJbotsT64TBsBj7ol97EXxz+N6LkB3tNM3bFBvhAFXZ+EvrJ0o+2QoEOH0igWjMI
 4AXwSl6htCs+wRmqAqpJfZpfI7kv2MDUB9ZGAbuXRS888OK78Dzt1CupPW7Q12xh
 yYMNsXZvRvK82n0DfqBLQ53SIe/L3PotG2Cc29hjGaHjK+YcwVRvdp/2B3ID3O2L
 6ap/M6KA+i1SiYZI6yAEYT76jKOam9YG/psb76q66xILJ7h5XQOZODYQ9zC2towo
 Pjb+bCPzHZPm+v7xtSsP6aanZ+5xRXO91JjvsWl9UOQVDCA/Jt98H5qhCJZjIeIs
 OJ7z9PbTv0/jcBBRrjJyZIUE85omDliY4h04B3Yu44xa7Q9e7wbE+Vs/6L9txS0e
 L8TBNHmrYB7ZIprCIhcE
 =UB7l
 -----END PGP SIGNATURE-----

Merge tag 'upstream-4.9-rc1' of git://git.infradead.org/linux-ubifs

Pull UBI/UBIFS updates from Richard Weinberger:
 "This pull request contains:

   - Fixes for both UBI and UBIFS
   - overlayfs support (O_TMPFILE, RENAME_WHITEOUT/EXCHANGE)
   - Code refactoring for the upcoming MLC support"

[ Ugh, we just got rid of the "rename2()" naming for the extended rename
  functionality. And this re-introduces it in ubifs with the cross-
  renaming and whiteout support.

  But rather than do any re-organizations in the merge itself, the
  naming can be cleaned up later ]

* tag 'upstream-4.9-rc1' of git://git.infradead.org/linux-ubifs: (27 commits)
  UBIFS: improve function-level documentation
  ubifs: fix host xattr_len when changing xattr
  ubifs: Use move variable in ubifs_rename()
  ubifs: Implement RENAME_EXCHANGE
  ubifs: Implement RENAME_WHITEOUT
  ubifs: Implement O_TMPFILE
  ubi: Fix Fastmap's update_vol()
  ubi: Fix races around ubi_refill_pools()
  ubi: Deal with interrupted erasures in WL
  UBI: introduce the VID buffer concept
  UBI: hide EBA internals
  UBI: provide an helper to query LEB information
  UBI: provide an helper to check whether a LEB is mapped or not
  UBI: add an helper to check lnum validity
  UBI: simplify LEB write and atomic LEB change code
  UBI: simplify recover_peb() code
  UBI: move the global ech and vidh variables into struct ubi_attach_info
  UBI: provide helpers to allocate and free aeb elements
  UBI: fastmap: use ubi_io_{read, write}_data() instead of ubi_io_{read, write}()
  UBI: fastmap: use ubi_rb_for_each_entry() in unmap_peb()
  ...
2016-10-11 10:49:44 -07:00
Richard Weinberger
f7d11b33d4 ubi: Fix Fastmap's update_vol()
Usually Fastmap is free to consider every PEB in one of the pools
as newer than the existing PEB. Since PEBs in a pool are by definition
newer than everything else.
But update_vol() missed the case that a pool can contain more than
one candidate.

Cc: <stable@vger.kernel.org>
Fixes: dbb7d2a88d ("UBI: Add fastmap core")
Signed-off-by: Richard Weinberger <richard@nod.at>
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:55:02 +02:00
Richard Weinberger
2e8f08deab ubi: Fix races around ubi_refill_pools()
When writing a new Fastmap the first thing that happens
is refilling the pools in memory.
At this stage it is possible that new PEBs from the new pools
get already claimed and written with data.
If this happens before the new Fastmap data structure hits the
flash and we face power cut the freshly written PEB will not
scanned and unnoticed.

Solve the issue by locking the pools until Fastmap is written.

Cc: <stable@vger.kernel.org>
Fixes: dbb7d2a88d ("UBI: Add fastmap core")
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:54:01 +02:00
Richard Weinberger
2365418879 ubi: Deal with interrupted erasures in WL
When Fastmap is used we can face here an -EBADMSG
since Fastmap cannot know about unmaps.
If the erasure was interrupted the PEB may show ECC
errors and UBI would go to ro-mode as it assumes
that the PEB was check during attach time, which is
not the case with Fastmap.

Cc: <stable@vger.kernel.org>
Fixes: dbb7d2a88d ("UBI: Add fastmap core")
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:49:54 +02:00
Boris Brezillon
3291b52f9f UBI: introduce the VID buffer concept
Currently, all VID headers are allocated and freed using the
ubi_zalloc_vid_hdr() and ubi_free_vid_hdr() function. These functions
make sure to align allocation on ubi->vid_hdr_alsize and adjust the
vid_hdr pointer to match the ubi->vid_hdr_shift requirements.
This works fine, but is a bit convoluted.
Moreover, the future introduction of LEB consolidation (needed to support
MLC/TLC NANDs) will allows a VID buffer to contain more than one VID
header.

Hence the creation of a ubi_vid_io_buf struct to attach extra information
to the VID header.

We currently only store the actual pointer of the underlying buffer, but
will soon add the number of VID headers contained in the buffer.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:48:14 +02:00
Boris Brezillon
799dca34ac UBI: hide EBA internals
Create a private ubi_eba_table struct to hide EBA internals and provide
helpers to allocate, destroy, copy and assing an EBA table to a volume.

Now that external EBA users are using helpers to query/modify the EBA
state we can safely change the internal representation, which will be
needed to support the LEB consolidation concept.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:48:14 +02:00
Boris Brezillon
1f81a5ccab UBI: provide an helper to query LEB information
This is part of our attempt to hide EBA internals from other part of the
implementation in order to easily adapt it to the MLC needs.

Here we are creating an ubi_eba_leb_desc struct to hide the way we keep
track of the LEB to PEB mapping.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:48:14 +02:00
Boris Brezillon
7554769641 UBI: provide an helper to check whether a LEB is mapped or not
This is part of the process of hiding UBI EBA's internal to other part of
the UBI implementation, so that we can add new information to the EBA
table without having to patch different places in the UBI code.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:48:14 +02:00
Boris Brezillon
9a5f09ac0a UBI: add an helper to check lnum validity
ubi_leb_valid() is here to replace the
lnum < 0 || lnum >= vol->reserved_pebs checks.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:48:14 +02:00
Boris Brezillon
2d78aee426 UBI: simplify LEB write and atomic LEB change code
ubi_eba_write_leb(), ubi_eba_write_leb_st() and
ubi_eba_atomic_leb_change() are using a convoluted retry/exit path.
Add the try_write_vid_and_data() function to simplify the retry logic
and make sure we have a single exit path instead of manually releasing
the resources in each error path.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:48:14 +02:00
Boris Brezillon
f036dfeb85 UBI: simplify recover_peb() code
recover_peb() is using a convoluted retry/exit path. Add try_recover_peb()
to simplify the retry logic and make sure we have a single exit path
instead of manually releasing the resource in each error path.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:48:14 +02:00
Boris Brezillon
7b6b749b12 UBI: move the global ech and vidh variables into struct ubi_attach_info
Even if it works fine with those global variables, attaching the
temporary ech and vidh objects used during UBI scan to the
ubi_attach_info object sounds like a more future-proof option.

For example, attaching several UBI devices in parallel is prevented by
this use of global variable. And also because global variables should
be avoided in general.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:48:14 +02:00
Boris Brezillon
91f4285fe3 UBI: provide helpers to allocate and free aeb elements
This not only hides the aeb allocation internals (which is always good in
case we ever want to change the allocation system), but also helps us
factorize the initialization of some common fields (ec and pnum).

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:48:14 +02:00
Boris Brezillon
fcbb6af17b UBI: fastmap: use ubi_io_{read, write}_data() instead of ubi_io_{read, write}()
ubi_io_{read,write}_data() are wrappers around ubi_io_{read/write}() that
are used to read/write eraseblock payload data, which is exactly what
fastmap does when calling ubi_io_{read,write}().

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:48:14 +02:00
Boris Brezillon
f2fb1346b3 UBI: fastmap: use ubi_rb_for_each_entry() in unmap_peb()
Use the ubi_rb_for_each_entry() macro instead of open-coding it.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:48:14 +02:00
Boris Brezillon
f9efe8d8a5 UBI: factorize destroy_av() and ubi_remove_av() code
Those functions are pretty much doing the same thing, except
ubi_remove_av() is putting the aeb elements attached to the volume into
the ai->erase list and the destroy_av() is freeing them.

Rework destroy_av() to handle both cases.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:48:14 +02:00
Boris Brezillon
de4c455b3e UBI: factorize code used to manipulate volumes at attach time
Volume creation/search code is duplicated in a few places (fastmap and
non fastmap code). Create some helpers to factorize the code.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:48:14 +02:00
Boris Brezillon
5f09aaa9b3 UBI: use vol->usable_leb_size instead of (ubi->leb_size - vol->data_pad)
vol->usable_size is already set to ubi->leb_size - vol->data_pad. Use
vol->usable_size instead of recalculating it.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:48:14 +02:00
Boris Brezillon
ecbfa8eaba UBI: fastmap: scrub PEB when bitflips are detected in a free PEB EC header
scan_pool() does not mark the PEB for scrubing when bitflips are
detected in the EC header of a free PEB (VID header region left to
0xff).
Make sure we scrub the PEB in this case.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Fixes: dbb7d2a88d ("UBI: Add fastmap core")
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:48:14 +02:00
Boris Brezillon
4c368421bc UBI: fastmap: avoid multiple be32_to_cpu() when unneccesary
process_pool_aeb() does several times the be32_to_cpu(new_vh->vol_id)
operation. Create a temporary variable and do it once.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:48:14 +02:00
Boris Brezillon
3f84e454eb UBI: fix add_fastmap() to use the vid_hdr passed in argument
add_fastmap() is passed a ubi_vid_hdr pointer in argument, but is
referencing the global vidh pointer.
Even if this is correct from a functional point of view (vidh and vid_hdr
point to the same object), it is confusing.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:48:14 +02:00
Boris Brezillon
f5f9d43f0a UBI: fastmap: use ubi_find_volume() instead of open coding it
process_pool_aeb() re-implements the logic found in ubi_find_volume().
Call ubi_find_volume() to avoid this duplication.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-10-02 22:48:14 +02:00
Christoph Hellwig
7d7e0f90b7 blk-mq: remove ->map_queue
All drivers use the default, so provide an inline version of it.  If we
ever need other queue mapping we can add an optional method back,
although supporting will also require major changes to the queue setup
code.

This provides better code generation, and better debugability as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-09-15 08:42:03 -06:00
Richard Weinberger
5d71afb008 ubi: Use bitmaps in Fastmap self-check code
...don't waste memory by allocating one sizeof(int) per
PEB.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-07-29 23:32:58 +02:00
Richard Weinberger
74f2c6e9a4 ubi: Be more paranoid while seaching for the most recent Fastmap
Since PEB erasure is asynchornous it can happen that there is
more than one Fastmap on the MTD. This is fine because the attach logic
will pick the Fastmap data structure with the highest sequence number.

On a not so well configured MTD stack spurious ECC errors are common.
Causes can be different, bad hardware, wrong operating modes, etc...
If the most current Fastmap renders bad due to ECC errors UBI might
pick an older Fastmap to attach from.
While this can only happen on an anyway broken setup it will show
completely different sympthoms and makes finding the root cause much
more difficult.
So, be debug friendly and fall back to scanning mode of we're facing
an ECC error while scanning for Fastmap.

Cc: <stable@vger.kernel.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-07-29 23:32:54 +02:00
Richard Weinberger
5283ec72b0 ubi: Check whether the Fastmap anchor matches the super block
This helps to detect cases where an user copies an UBI image to
another target with different bad blocks.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-07-29 23:32:43 +02:00
Richard Weinberger
fdf10ed710 ubi: Rework Fastmap attach base code
Introduce a new list to the UBI attach information
object to be able to deal better with old and corrupted
Fastmap eraseblocks.
Also move more Fastmap specific code into fastmap.c.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-07-29 23:32:42 +02:00
Richard Weinberger
be8011053f ubi: Fix whitespace issue in count_fastmap_pebs()
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-07-29 23:32:40 +02:00
Richard Weinberger
243a4f8126 ubi: Introduce vol_ignored()
This makes the logic more easy to follow.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-07-29 23:32:39 +02:00
Richard Weinberger
d139d300a9 ubi: Fix scan_fast() comment
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-07-29 23:32:38 +02:00
Richard Weinberger
4946784bd3 ubi: Make volume resize power cut aware
When the volume resize operation shrinks a volume,
LEBs will be unmapped. Since unmapping will not erase these
LEBs immediately we have to wait for that operation to finish.
Otherwise in case of a power cut right after writing the new
volume table the UBI attach process can find more LEBs than the
volume table knows. This will render the UBI image unattachable.

Fix this issue by waiting for erase to complete and write the new
volume table afterward.

Cc: <stable@vger.kernel.org>
Reported-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-07-29 23:32:01 +02:00
Richard Weinberger
bc743f34df ubi: Fix early logging
We cannot use ubi_* logging functions before the UBI
object is initialized.

Cc: <stable@vger.kernel.org>
Fixes: 3260870331 ("UBI: Extend UBI layer debug/messaging capabilities")
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-07-29 23:31:55 +02:00
Richard Weinberger
95b54e1e10 ubi: gluebi: Fix double refcounting
There is no need to call get/put on the module
reference in gluebi_get/put_device() callbacks.
Since mtd->owner is the gluebi module itself
mtdcore.c will take care of proper refcounting
in __get/put_mtd_device() before executing the
callbacks.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-07-29 23:31:54 +02:00
Iosif Harutyunov
714fb87e8b ubi: Fix race condition between ubi device creation and udev
Install the UBI device object before we arm sysfs.
Otherwise udev tries to read sysfs attributes before UBI is ready and
udev rules will not match.

Cc: <stable@vger.kernel.org>
Signed-off-by: Iosif Harutyunov <iharutyunov@sonicwall.com>
[rw: massaged commit message]
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-07-29 23:30:28 +02:00
Sascha Hauer
8a8e8d2fdb ubi: Only read necessary size when reading the VID header
When reading the vid hdr from the device UBI always reads a whole
page. Instead, read only the data we actually need and speed up
attachment of UBI devices by potentially making use of reading
subpages if the NAND driver supports it.

Since the VID header may be at offset vid_hdr_shift in the page and
we can only read from the beginning of a page we have to add that
offset to the read size.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-07-29 23:29:44 +02:00
Richard Weinberger
972228d874 ubi: Make recover_peb power cut aware
recover_peb() was never power cut aware,
if a power cut happened right after writing the VID header
upon next attach UBI would blindly use the new partial written
PEB and all data from the old PEB is lost.

In order to make recover_peb() power cut aware, write the new
VID with a proper crc and copy_flag set such that the UBI attach
process will detect whether the new PEB is completely written
or not.
We cannot directly use ubi_eba_atomic_leb_change() since we'd
have to unlock the LEB which is facing a write error.

Cc: stable@vger.kernel.org
Reported-by: Jörg Pfähler <pfaehler@isse.de>
Reviewed-by: Jörg Pfähler <pfaehler@isse.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-06-23 00:29:32 +02:00
Richard Weinberger
61edc3f3b5 ubi: Don't bypass ->getattr()
Directly accessing inode fields bypasses ->getattr()
and can cause problems when the underlying filesystem
does not have the default ->getattr() implementation.

So instead of obtaining the backing inode via d_backing_inode()
use vfs_getattr() and obtain what we need from the kstat struct.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-06-14 10:51:42 +02:00
Richard Weinberger
1a498ec45e Revert "mtd: switch open_mtd_by_chdev() to use of vfs_stat()"
This reverts commit 87f15d4add.

vfs_stat() can only be used on user supplied buffers.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-06-14 10:51:42 +02:00
Richard Weinberger
ad022c8718 Revert "mtd: switch ubi_open_volume_path() to vfs_stat()"
This reverts commit 322ea0bbf3.

vfs_stat() can only be used on user supplied buffers.
UBI's kapi.c is the API to the kernel and therefore vfs_stat()
is inappropriate.

This solves the problem that mounting any UBIFS will immediately
fail with -EINVAL.

Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-06-14 10:51:42 +02:00
Linus Torvalds
23a3e178b9 This pull request contains mostly cleanups and minor
improvements of UBI and UBIFS.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJXSJl6AAoJEEtJtSqsAOnWbbAP/2ls5KpGsRSoOP4NsKo5+8k9
 GsZ8hb0iN+UGDxMB5Jlj6O1ISNPz/8o7iGBuWa5OWFdh4Nn38sA1Qv996Rg3Ca3O
 8KYEGAD7POWANfxCTmyQX/L8AsOP62B3diFktPyGrtYmLsVzx/AFN9/nM27ticdF
 PQpUtLwvL/m7/oE6ymFCB4x34+DkTa7Jx4e+chVlF61CUipGc5I60VVX1Do+/DWt
 S37ajjIxMJx0BO7Zs6vrcH9uGFSbvSpVBr9/sC16t0Fa1/vAa9pXieg3y5XZmtOl
 +6NlwUxXVu5IjHkYYGZP8jsrA7bKJlflgCo/w6WKC6V0KECnBx+oSG9uMjDVPBSM
 rc4VF4nevnNLNqFGBRWhxiYrRUCdB1TcWVDOzM16peNeUNJiWj/+4PbsiJQBwECH
 ZnE/dBrjU7v+J8UmHsUJSveW6/5PluJf1loDzrip7gk5QD13wdpPo3VOycOj1vMD
 iS6H/TBjtcEzWWh7qcBbtgkbHQH0g+w9YSzVy7ODFweuoq+4qCZ1davqzzZaRv3q
 hylor1gea3b4GnH1EDjmXyQOOuIf2rLdJhou8WoZhmy2q3FrnE+89/CCOCMkeQMQ
 qQr1lAbxD7A+8bYwSZ7Nvv6lrBUrat59DV6FHFc69MdkGT7J6z85ZOyd3lxRvJc/
 XiZZkKhrYSKgV/iLFhHW
 =L16y
 -----END PGP SIGNATURE-----

Merge tag 'upstream-4.7-rc1' of git://git.infradead.org/linux-ubifs

Pull UBI/UBIFS updates from Richard Weinberger:
 "This contains mostly cleanups and minor improvements of UBI and UBIFS"

* tag 'upstream-4.7-rc1' of git://git.infradead.org/linux-ubifs:
  ubifs: ubifs_dump_inode: Fix dumping field bulk_read
  UBI: Fix static volume checks when Fastmap is used
  UBI: Set free_count to zero before walking through erase list
  UBI: Silence an unintialized variable warning
  UBI: Clean up return in ubi_remove_volume()
  UBI: Modify wrong comment in ubi_leb_map function.
  UBI: Don't read back all data in ubi_eba_copy_leb()
  UBI: Add ro-mode sysfs attribute
2016-05-27 18:49:29 -07:00
Richard Weinberger
1900149c83 UBI: Fix static volume checks when Fastmap is used
Ezequiel reported that he's facing UBI going into read-only
mode after power cut. It turned out that this behavior happens
only when updating a static volume is interrupted and Fastmap is
used.

A possible trace can look like:
ubi0 warning: ubi_io_read_vid_hdr [ubi]: no VID header found at PEB 2323, only 0xFF bytes
ubi0 warning: ubi_eba_read_leb [ubi]: switch to read-only mode
CPU: 0 PID: 833 Comm: ubiupdatevol Not tainted 4.6.0-rc2-ARCH #4
Hardware name: SAMSUNG ELECTRONICS CO., LTD. 300E4C/300E5C/300E7C/NP300E5C-AD8AR, BIOS P04RAP 10/15/2012
0000000000000286 00000000eba949bd ffff8800c45a7b38 ffffffff8140d841
ffff8801964be000 ffff88018eaa4800 ffff8800c45a7bb8 ffffffffa003abf6
ffffffff850e2ac0 8000000000000163 ffff8801850e2ac0 ffff8801850e2ac0
Call Trace:
[<ffffffff8140d841>] dump_stack+0x63/0x82
[<ffffffffa003abf6>] ubi_eba_read_leb+0x486/0x4a0 [ubi]
[<ffffffffa00453b3>] ubi_check_volume+0x83/0xf0 [ubi]
[<ffffffffa0039d97>] ubi_open_volume+0x177/0x350 [ubi]
[<ffffffffa00375d8>] vol_cdev_open+0x58/0xb0 [ubi]
[<ffffffff8124b08e>] chrdev_open+0xae/0x1d0
[<ffffffff81243bcf>] do_dentry_open+0x1ff/0x300
[<ffffffff8124afe0>] ? cdev_put+0x30/0x30
[<ffffffff81244d36>] vfs_open+0x56/0x60
[<ffffffff812545f4>] path_openat+0x4f4/0x1190
[<ffffffff81256621>] do_filp_open+0x91/0x100
[<ffffffff81263547>] ? __alloc_fd+0xc7/0x190
[<ffffffff812450df>] do_sys_open+0x13f/0x210
[<ffffffff812451ce>] SyS_open+0x1e/0x20
[<ffffffff81a99e32>] entry_SYSCALL_64_fastpath+0x1a/0xa4

UBI checks static volumes for data consistency and reads the
whole volume upon first open. If the volume is found erroneous
users of UBI cannot read from it, but another volume update is
possible to fix it. The check is performed by running
ubi_eba_read_leb() on every allocated LEB of the volume.
For static volumes ubi_eba_read_leb() computes the checksum of all
data stored in a LEB. To verify the computed checksum it has to read
the LEB's volume header which stores the original checksum.
If the volume header is not found UBI treats this as fatal internal
error and switches to RO mode. If the UBI device was attached via a
full scan the assumption is correct, the volume header has to be
present as it had to be there while scanning to get known as mapped.
If the attach operation happened via Fastmap the assumption is no
longer correct. When attaching via Fastmap UBI learns the mapping
table from Fastmap's snapshot of the system state and not via a full
scan. It can happen that a LEB got unmapped after a Fastmap was
written to the flash. Then UBI can learn the LEB still as mapped and
accessing it returns only 0xFF bytes. As UBI is not a FTL it is
allowed to have mappings to empty PEBs, it assumes that the layer
above takes care of LEB accounting and referencing.
UBIFS does so using the LEB property tree (LPT).
For static volumes UBI blindly assumes that all LEBs are present and
therefore special actions have to be taken.

The described situation can happen when updating a static volume is
interrupted, either by a user or a power cut.
The volume update code first unmaps all LEBs of a volume and then
writes LEB by LEB. If the sequence of operations is interrupted UBI
detects this either by the absence of LEBs, no volume header present
at scan time, or corrupted payload, detected via checksum.
In the Fastmap case the former method won't trigger as no scan
happened and UBI automatically thinks all LEBs are present.
Only by reading data from a LEB it detects that the volume header is
missing and incorrectly treats this as fatal error.
To deal with the situation ubi_eba_read_leb() from now on checks
whether we attached via Fastmap and handles the absence of a
volume header like a data corruption error.
This way interrupted static volume updates will correctly get detected
also when Fastmap is used.

Cc: <stable@vger.kernel.org>
Reported-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Tested-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-05-24 15:24:37 +02:00
Heiko Schocher
73b0cd57fc UBI: Set free_count to zero before walking through erase list
Set free_count to zero before walking through ai->erase list
in wl_init().

Found in U-Boot as U-Boot has no workqueue/threads, it immediately
calls erase_worker(), which increase for each erased block
free_count. Without this patch, free_count gets after
this initialized to zero in wl_init(), so the free_count
variable always has the maybe wrong value 0 in U-Boot.

Signed-off-by: Heiko Schocher <hs@denx.de>
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-05-24 15:24:33 +02:00
Dan Carpenter
24663e7281 UBI: Silence an unintialized variable warning
My static checker complains that "val" is uninitialized when kstrtoint()
fails.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-05-24 15:24:30 +02:00
Dan Carpenter
fadb3665ba UBI: Clean up return in ubi_remove_volume()
My static checker says that "err" can be uninitialized if
"vol->reserved_pebs" is <= 0.  I don't think that can happen but
returning a literal is cleaner anyway.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-05-24 15:24:26 +02:00
z00189512
960b35d06b UBI: Modify wrong comment in ubi_leb_map function.
Signed-off-by: z00189512 <abc.zhangliang@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-05-24 15:24:20 +02:00
Richard Weinberger
1e0a74f10d UBI: Don't read back all data in ubi_eba_copy_leb()
Drop this paranoia check from the old days.
If our MTD driver or the flash is so bad that we even cannot
trust it to write data we have bigger problems.

If one really does not trust the flash and wants write-verify
she can enable UBI io checks using debugfs.

Signed-off-by: Richard Weinberger <richard@nod.at>
2016-05-24 15:21:01 +02:00
Ezequiel Garcia
525bab71fe UBI: Add ro-mode sysfs attribute
On serious situations, UBI may detect serious device corruption,
and switch to read-only mode to protect the data and allow debugging.
This commit exposes this ro-mode on sysfs, so it can be obtained
by userspace tools.

Signed-off-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-05-24 15:15:26 +02:00
Al Viro
322ea0bbf3 mtd: switch ubi_open_volume_path() to vfs_stat()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-27 23:49:17 -04:00
Al Viro
87f15d4add mtd: switch open_mtd_by_chdev() to use of vfs_stat()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-03-27 23:49:12 -04:00
Joe Perches
58d303def2 mtd: ubi: Add logging functions ubi_msg, ubi_warn and ubi_err
Using logging functions instead of macros can reduce overall object size.

$ size drivers/mtd/ubi/built-in.o*
   text	   data	    bss	    dec	    hex	filename
 271620	 163364	  73696	 508680	  7c308	drivers/mtd/ubi/built-in.o.allyesconfig.new
 287638	 165380	  73504	 526522	  808ba	drivers/mtd/ubi/built-in.o.allyesconfig.old
  87728	   3780	    504	  92012	  1676c	drivers/mtd/ubi/built-in.o.defconfig.new
  97084	   3780	    504	 101368	  18bf8	drivers/mtd/ubi/built-in.o.defconfig.old

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-03-20 21:36:05 +01:00
Richard Weinberger
e4f6daac20 ubi: Fix out of bounds write in volume update code
ubi_start_leb_change() allocates too few bytes.
ubi_more_leb_change_data() will write up to req->upd_bytes +
ubi->min_io_size bytes.

Cc: stable@vger.kernel.org
Signed-off-by: Richard Weinberger <richard@nod.at>
Reviewed-by: Boris Brezillon <boris.brezillon@free-electrons.com>
2016-03-05 21:56:23 +01:00
Al Viro
5955102c99 wrappers for ->i_mutex access
parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
inode_foo(inode) being mutex_foo(&inode->i_mutex).

Please, use those for access to ->i_mutex; over the coming cycle
->i_mutex will become rwsem, with ->lookup() done with it held
only shared.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-01-22 18:04:28 -05:00
Sebastian Siewior
34b89df903 mtd: ubi: wl: avoid erasing a PEB which is empty
wear_leveling_worker() currently unconditionally puts a PEB on erase in
the error case even it just been taken from the free_list and never
used.
In case the PEB was never used it can be put back on the free list
saving a precious erase cycle.

v1…v2:
	- to_leb_clean -> dst_leb_clean
	- use the nested option for ensure_wear_leveling()
	- do_sync_erase() can't go -ENOMEM so we can just go into
	  RO-mode now.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
2016-01-10 12:33:11 +01:00
Sebastian Siewior
6b238de189 mtd: ubi: don't leak e if schedule_erase() fails
If __erase_worker() fails to erase the EB and schedule_erase() fails as
well to do anything about it then we go RO. But that is not a reason to
leak the e argument here. Therefore clean up e.

Cc: <stable@vger.kernel.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-12-16 22:59:03 +01:00
Sebastian Siewior
1a31b20cd8 mtd: ubi: fixup error correction in do_sync_erase()
Since fastmap we gained do_sync_erase(). This function can return an error
and its error handling isn't obvious. First the memory allocation for
struct ubi_work can fail and as such struct ubi_wl_entry is leaked.
However if the memory allocation succeeds then the tail function takes
care of the struct ubi_wl_entry. A free here could result in a double
free.
To make the error handling simpler, I split the tail function into one
piece which does the work and another which frees the struct ubi_work
which is passed as argument. As result do_sync_erase() can keep the
struct on stack and we get rid of one error source.

Cc: <stable@vger.kernel.org>
Fixes: 8199b901a ("UBI: Add fastmap support to the WL sub-system")
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-12-16 22:52:46 +01:00
Brian Norris
2e69d4912f UBI: fix use of "VID" vs. "EC" in header self-check
Looks like a typo, using UBI_EC_HDR_SIZE_CRC (note the "EC") to compute
the CRC for the VID header.

This shouldn't cause any functional change, as both structures are 64
bytes. Verified with:

	BUILD_BUG_ON(UBI_VID_HDR_SIZE_CRC != UBI_EC_HDR_SIZE_CRC);

Reported here:
http://lists.infradead.org/pipermail/linux-mtd/2013-September/048570.html

Reported by: Bill Pringlemeir <bpringlemeir@gmail.com>
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-12-16 22:46:26 +01:00
Sudip Mukherjee
97cb69dd80 UBI: fix return error code
We are checking dfs_rootdir for error value or NULL. But in the
conditional ternary operator we returned -ENODEV if dfs_rootdir contains
an error value and returned PTR_ERR(dfs_rootdir) if dfs_rootdir is NULL.
So in the case of dfs_rootdir being NULL we actually assigned 0 to err
and returned it to the caller implying a success.
Lets return -ENODEV when dfs_rootdir is NULL else return
PTR_ERR(dfs_rootdir).

Signed-off-by: Sudip Mukherjee <sudip@vectorindia.org>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-12-16 22:45:04 +01:00
Linus Torvalds
01504f5e9e This pull request includes the following UBI/UBIFS changes:
* access time support for UBIFS by Dongsheng Yang
 * random cleanups and bug fixes all over the place
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABAgAGBQJWQmnVAAoJEEtJtSqsAOnWpnkP/0pUN+Cx58QJoJXxaSgClH/l
 QwYGznz4Fv3QuSafXJZy8CPzsTasaOOtoVKJPfPTvP8Drrl3FFbtdhqlqdx1YKqw
 DuKeViwmE/HF49a5pxgtR8032l35s3zZUWwd2yKA3B6SVEIsjAoWlIutyzKA/DiR
 R4mGVsfRevfzq1+l2YXGGZ8PJXMtRKnFTjzmIUJY0OeT3xCvl/uUFCid2x693aut
 GpJ6AMKGYcU+BuQJmb89wvdRYCHPdYEJr7TWq+GBxU3gDPlLY9302IhPq9ftv8Lp
 ClA0X3GZqVPR921XR+aA4ysRbfENT+UBewiBt2OEwQLnI6jxEn9EqsgErATxpOYy
 PEduYuU56mqc67EE4r7+YHfkidMSa8KrR9s+GpzR6JStW9HNGFDfGfCAK7BcdvWT
 +TuHfNk787Gbl4TO41iFI87s6zNQszYOpxBTEFYHx0FQWLNxD5c79UkisL+biqvJ
 bX8+7pClJQXcY0CXqe56GVQLC5v4HbvMmnstiNk52ihxEB0NT6htzWe/GPb/+UZ1
 K+kOAT54DPeag3FK4J1WZ57cYyJHdbyASB5vKvb+A1IwzrYFo8P35GekoWNK7eAJ
 Od5mb++7pTEFSH7oSwsprPOmrHLRmXgRG8yhLRfqX1ZwFeyFdqGQxjce9O/QB1jo
 jr4dxAY6Gnjdo1q07QSI
 =9n2S
 -----END PGP SIGNATURE-----

Merge tag 'upstream-4.4-rc1' of git://git.infradead.org/linux-ubifs

Pull UBI/UBIFS updates from Richard Weinberger:

 - access time support for UBIFS by Dongsheng Yang

 - random cleanups and bug fixes all over the place

* tag 'upstream-4.4-rc1' of git://git.infradead.org/linux-ubifs:
  ubifs: introduce UBIFS_ATIME_SUPPORT to ubifs
  ubifs: make ubifs_[get|set]xattr atomic
  UBIFS: Delete unnecessary checks before the function call "iput"
  UBI: Remove in vain semicolon
  UBI: Fastmap: Fix PEB array type
  UBIFS: Fix possible memory leak in ubifs_readdir()
  fs/ubifs: remove unnecessary new_valid_dev check
  ubi: fastmap: Implement produce_free_peb()
  UBIFS: print verbose message when rescanning a corrupted node
  UBIFS: call dbg_is_power_cut() instead of reading c->dbg->pc_happened
  UBI: drop null test before destroy functions
  UBI: Update comments to reflect UBI_METAONLY flag
  UBI: Fix debug message
  UBI: Fix typo in comment
  UBI: Fastmap: Simplify expression
  UBIFS: fix a typo in comment of ubifs_budget_req
  UBIFS: use kmemdup rather than duplicating its implementation
2015-11-10 16:35:06 -08:00
Linus Torvalds
75021d2859 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial
Pull trivial updates from Jiri Kosina:
 "Trivial stuff from trivial tree that can be trivially summed up as:

   - treewide drop of spurious unlikely() before IS_ERR() from Viresh
     Kumar

   - cosmetic fixes (that don't really affect basic functionality of the
     driver) for pktcdvd and bcache, from Julia Lawall and Petr Mladek

   - various comment / printk fixes and updates all over the place"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
  bcache: Really show state of work pending bit
  hwmon: applesmc: fix comment typos
  Kconfig: remove comment about scsi_wait_scan module
  class_find_device: fix reference to argument "match"
  debugfs: document that debugfs_remove*() accepts NULL and error values
  net: Drop unlikely before IS_ERR(_OR_NULL)
  mm: Drop unlikely before IS_ERR(_OR_NULL)
  fs: Drop unlikely before IS_ERR(_OR_NULL)
  drivers: net: Drop unlikely before IS_ERR(_OR_NULL)
  drivers: misc: Drop unlikely before IS_ERR(_OR_NULL)
  UBI: Update comments to reflect UBI_METAONLY flag
  pktcdvd: drop null test before destroy functions
2015-11-07 13:05:44 -08:00
Richard Weinberger
a396ce4bd2 UBI: Remove in vain semicolon
...found while browsing.

Signed-off-by: Richard Weinberger <richard@nod.at>
2015-11-06 23:26:51 +01:00
Ezequiel García
2a130f1aa5 UBI: Fastmap: Fix PEB array type
The PEB array is an array of __be32, so let's fix the
scan_pool() prototype accordingly.

Signed-off-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-11-06 23:26:50 +01:00
Richard Weinberger
1cb8f9776c ubi: fastmap: Implement produce_free_peb()
If fastmap requests a free PEB for a pool and UBI is busy
with erasing PEBs we need to offer a function to wait for one.
We can reuse produce_free_peb() from the non-fastmap WL code
but with different locking semantics.

Cc: stable@vger.kernel.org # 4.1.x-
Reported-and-tested-by: Jörg Krause <joerg.krause@embedded.rocks>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-10-03 23:25:30 +02:00
Julia Lawall
f9a113d65f UBI: drop null test before destroy functions
Remove unneeded NULL test.

The semantic patch that makes this change is as follows:
(http://coccinelle.lip6.fr/)

// <smpl>
@@ expression x; @@
-if (x != NULL)
  \(kmem_cache_destroy\|mempool_destroy\|dma_pool_destroy\)(x);
// </smpl>

Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-10-03 20:36:10 +02:00
Andrew Murray
86f6e454e6 UBI: Update comments to reflect UBI_METAONLY flag
This patch trivially updates code comments to reflect the addition of the
UBI_METAONLY flag - as discussed https://lkml.org/lkml/2014/10/29/764

Cc: Richard Weinberger <richard@nod.at>
Cc: trivial@kernel.org
Signed-off-by: Andrew Murray <amurray@embedded-bits.co.uk>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-10-03 20:11:59 +02:00
Richard Weinberger
5347417e56 UBI: Fix debug message
We have to use j instead of i. i is the volume id
and not the block.

Reported-by: Alexander.Block@continental-corporation.com
Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Brian Norris <computersforpeace@gmail.com>
2015-10-03 20:09:55 +02:00
Richard Weinberger
4ebb4c9dcb UBI: Fix typo in comment
While we are here fix a s/beween/between typo.

Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Brian Norris <computersforpeace@gmail.com>
2015-10-03 20:09:41 +02:00
Richard Weinberger
876f9a3487 UBI: Fastmap: Simplify expression
There is no need to compute pnum again.

Signed-off-by: Richard Weinberger <richard@nod.at>
Acked-by: Brian Norris <computersforpeace@gmail.com>
2015-10-03 20:08:46 +02:00
Andrew Murray
061eebba3d UBI: Update comments to reflect UBI_METAONLY flag
This patch trivially updates code comments to reflect the addition of the
UBI_METAONLY flag - as discussed https://lkml.org/lkml/2014/10/29/764

Cc: Richard Weinberger <richard@nod.at>
Cc: trivial@kernel.org
Signed-off-by: Andrew Murray <amurray@embedded-bits.co.uk>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2015-09-29 15:09:02 +02:00
shengyong
7c7feb2ebf UBI: return ENOSPC if no enough space available
UBI: attaching mtd1 to ubi0
UBI: scanning is finished
UBI error: init_volumes: not enough PEBs, required 706, available 686
UBI error: ubi_wl_init: no enough physical eraseblocks (-20, need 1)
UBI error: ubi_attach_mtd_dev: failed to attach mtd1, error -12 <= NOT ENOMEM
UBI error: ubi_init: cannot attach mtd1

If available PEBs are not enough when initializing volumes, return -ENOSPC
directly. If available PEBs are not enough when initializing WL, return
-ENOSPC instead of -ENOMEM.

Cc: stable@vger.kernel.org
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
Reviewed-by: David Gstir <david@sigma-star.at>
2015-09-29 12:47:05 +02:00
Richard Weinberger
281fda2767 UBI: Validate data_size
Make sure that data_size is less than LEB size.
Otherwise a handcrafted UBI image is able to trigger
an out of bounds memory access in ubi_compare_lebs().

Cc: stable@vger.kernel.org
Signed-off-by: Richard Weinberger <richard@nod.at>
Reviewed-by: David Gstir <david@sigma-star.at>
2015-09-29 12:47:04 +02:00
Linus Torvalds
02201e3f1b Minor merge needed, due to function move.
Main excitement here is Peter Zijlstra's lockless rbtree optimization to
 speed module address lookup.  He found some abusers of the module lock
 doing that too.
 
 A little bit of parameter work here too; including Dan Streetman's breaking
 up the big param mutex so writing a parameter can load another module (yeah,
 really).  Unfortunately that broke the usual suspects, !CONFIG_MODULES and
 !CONFIG_SYSFS, so those fixes were appended too.
 
 Cheers,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVkgKHAAoJENkgDmzRrbjxQpwQAJVmBN6jF3SnwbQXv9vRixjH
 58V33sb1G1RW+kXxQ3/e8jLX/4VaN479CufruXQp+IJWXsN/CH0lbC3k8m7u50d7
 b1Zeqd/Yrh79rkc11b0X1698uGCSMlzz+V54Z0QOTEEX+nSu2ZZvccFS4UaHkn3z
 rqDo00lb7rxQz8U25qro2OZrG6D3ub2q20TkWUB8EO4AOHkPn8KWP2r429Axrr0K
 wlDWDTTt8/IsvPbuPf3T15RAhq1avkMXWn9nDXDjyWbpLfTn8NFnWmtesgY7Jl4t
 GjbXC5WYekX3w2ZDB9KaT/DAMQ1a7RbMXNSz4RX4VbzDl+yYeSLmIh2G9fZb1PbB
 PsIxrOgy4BquOWsJPm+zeFPSC3q9Cfu219L4AmxSjiZxC3dlosg5rIB892Mjoyv4
 qxmg6oiqtc4Jxv+Gl9lRFVOqyHZrTC5IJ+xgfv1EyP6kKMUKLlDZtxZAuQxpUyxR
 HZLq220RYnYSvkWauikq4M8fqFM8bdt6hLJnv7bVqllseROk9stCvjSiE3A9szH5
 OgtOfYV5GhOeb8pCZqJKlGDw+RoJ21jtNCgOr6DgkNKV9CX/kL/Puwv8gnA0B0eh
 dxCeB7f/gcLl7Cg3Z3gVVcGlgak6JWrLf5ITAJhBZ8Lv+AtL2DKmwEWS/iIMRmek
 tLdh/a9GiCitqS0bT7GE
 =tWPQ
 -----END PGP SIGNATURE-----

Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module updates from Rusty Russell:
 "Main excitement here is Peter Zijlstra's lockless rbtree optimization
  to speed module address lookup.  He found some abusers of the module
  lock doing that too.

  A little bit of parameter work here too; including Dan Streetman's
  breaking up the big param mutex so writing a parameter can load
  another module (yeah, really).  Unfortunately that broke the usual
  suspects, !CONFIG_MODULES and !CONFIG_SYSFS, so those fixes were
  appended too"

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (26 commits)
  modules: only use mod->param_lock if CONFIG_MODULES
  param: fix module param locks when !CONFIG_SYSFS.
  rcu: merge fix for Convert ACCESS_ONCE() to READ_ONCE() and WRITE_ONCE()
  module: add per-module param_lock
  module: make perm const
  params: suppress unused variable error, warn once just in case code changes.
  modules: clarify CONFIG_MODULE_COMPRESS help, suggest 'N'.
  kernel/module.c: avoid ifdefs for sig_enforce declaration
  kernel/workqueue.c: remove ifdefs over wq_power_efficient
  kernel/params.c: export param_ops_bool_enable_only
  kernel/params.c: generalize bool_enable_only
  kernel/module.c: use generic module param operaters for sig_enforce
  kernel/params: constify struct kernel_param_ops uses
  sysfs: tightened sysfs permission checks
  module: Rework module_addr_{min,max}
  module: Use __module_address() for module_address_lookup()
  module: Make the mod_tree stuff conditional on PERF_EVENTS || TRACING
  module: Optimize __module_address() using a latched RB-tree
  rbtree: Implement generic latch_tree
  seqlock: Introduce raw_read_seqcount_latch()
  ...
2015-07-01 10:49:25 -07:00
shengyong
669d3d1233 UBI: Remove unnecessary `\'
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-06-03 09:41:45 +02:00
Takashi Iwai
53cd255ce7 UBI: Use static class and attribute groups
This patch cleans up the manual device_create_file() or
class_create_file() calls by replacing with static attribute groups.
It simplifies the code and also avoids the possible races between the
device/class registration and sysfs creations.

For the simplification, also make ubi_class a static instance with
initializers, too.

Amend a bit by Hujianyang.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Tested-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-06-02 13:16:25 +02:00
shengyong
2848594a20 UBI: add a helper function for updatting on-flash layout volumes
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-06-02 11:47:51 +02:00
shengyong
e96a8a3bb6 UBI: Fastmap: Do not add vol if it already exists
During fastmap attaching, check if a volume already exists when adding
the volume to volume tree. NOTE that the issue cannot happen, only if
the on-flash fastmap data is modified.

Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-06-02 11:46:14 +02:00
shengyong
e8d266cf8d UBI: Init vol->reserved_pebs by assignment
`vol' is a newly allocated value by kzalloc. Initialize it by assignment
instead of `+='.

Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-06-02 11:44:43 +02:00
shengyong
f6e951af34 UBI: Fastmap: Rename variables to make them meaningful
s/fmpl1/fmpl
s/fmpl2/fmpl_wl

Add "WL" to the error message when wrong WL pool magic number is detected.

Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-06-02 11:44:20 +02:00
shengyong
a18fd67267 UBI: Fastmap: Remove unnecessary `\'
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-06-02 11:43:52 +02:00
shengyong
212240dfd2 UBI: Fastmap: Use max() to get the larger value
Signed-off-by: Sheng Yong <shengyong1@huawei.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-06-02 11:43:33 +02:00
Dan Ehrenberg
2bf50d42f3 UBI: block: Dynamically allocate minor numbers
This patch makes ubiblock devices have minor numbers beginning from
0, allocated dynamically independently of the ubi device/volume
number. This property becomes useful because, on 32-bit architectures
with LFS turned off in a userspace program, device minor numbers
over 8 bits cause stat to return -EOVERFLOW. If the device number is
high (>1) due to multiple MTD partitions, such an overflow will occur.
While enabling LFS is clearly a nicer solution, it's often difficult
to turn on in practice globally as many widely distributed packages
don't work with LFS on.

Other storage systems have their own workarounds, with SCSI making
multiple device majors and MMC having a config option for the number
of partitions per device. A completely dynamic minor numbering is
simpler than these. It is unlikely that anyone is depending on a
static minor number since the major is dynamic anyway. In addition,
ubiblock is still relatively new, so now is the time to make such
changes.

Signed-off-by: Dan Ehrenberg <dehrenberg@chromium.org>
Acked-by: Ezequiel Garcia <ezequiel@vanguardiasur.com.ar>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-06-02 11:35:49 +02:00
Luis R. Rodriguez
9c27847dda kernel/params: constify struct kernel_param_ops uses
Most code already uses consts for the struct kernel_param_ops,
sweep the kernel for the last offending stragglers. Other than
include/linux/moduleparam.h and kernel/params.c all other changes
were generated with the following Coccinelle SmPL patch. Merge
conflicts between trees can be handled with Coccinelle.

In the future git could get Coccinelle merge support to deal with
patch --> fail --> grammar --> Coccinelle --> new patch conflicts
automatically for us on patches where the grammar is available and
the patch is of high confidence. Consider this a feature request.

Test compiled on x86_64 against:

	* allnoconfig
	* allmodconfig
	* allyesconfig

@ const_found @
identifier ops;
@@

const struct kernel_param_ops ops = {
};

@ const_not_found depends on !const_found @
identifier ops;
@@

-struct kernel_param_ops ops = {
+const struct kernel_param_ops ops = {
};

Generated-by: Coccinelle SmPL
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Junio C Hamano <gitster@pobox.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: cocci@systeme.lip6.fr
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-05-28 11:32:10 +09:30
Kevin Cernekee
98fb1ffd81 UBI: block: Add missing cache flushes
Block drivers are responsible for calling flush_dcache_page() on each
BIO request. This operation keeps the I$ coherent with the D$ on
architectures that don't have hardware coherency support. Without this
flush, random crashes are seen when executing user programs from an ext4
filesystem backed by a ubiblock device.

This patch is based on the change implemented in commit 2d4dc890b5
("block: add helpers to run flush_dcache_page() against a bio and a
request's pages").

Fixes: 9d54c8a33e ("UBI: R/O block driver on top of UBI volumes")
Signed-off-by: Kevin Cernekee <cernekee@chromium.org>
Signed-off-by: Ezequiel Garcia <ezequiel.garcia@imgtec.com>
Signed-off-by: Richard Weinberger <richard@nod.at>
2015-05-06 22:52:22 +02:00
Linus Torvalds
9ec3a646fe Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull fourth vfs update from Al Viro:
 "d_inode() annotations from David Howells (sat in for-next since before
  the beginning of merge window) + four assorted fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  RCU pathwalk breakage when running into a symlink overmounting something
  fix I_DIO_WAKEUP definition
  direct-io: only inc/dec inode->i_dio_count for file systems
  fs/9p: fix readdir()
  VFS: assorted d_backing_inode() annotations
  VFS: fs/inode.c helpers: d_inode() annotations
  VFS: fs/cachefiles: d_backing_inode() annotations
  VFS: fs library helpers: d_inode() annotations
  VFS: assorted weird filesystems: d_inode() annotations
  VFS: normal filesystems (and lustre): d_inode() annotations
  VFS: security/: d_inode() annotations
  VFS: security/: d_backing_inode() annotations
  VFS: net/: d_inode() annotations
  VFS: net/unix: d_backing_inode() annotations
  VFS: kernel/: d_inode() annotations
  VFS: audit: d_backing_inode() annotations
  VFS: Fix up some ->d_inode accesses in the chelsio driver
  VFS: Cachefiles should perform fs modifications on the top layer only
  VFS: AF_UNIX sockets should call mknod on the top layer only
2015-04-26 17:22:07 -07:00