License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boilerplate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier should be applied to
a file was done in a spreadsheet of side-by-side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few thousand files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file-by-file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
should be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
Criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
   SPDX license identifier                              # files
   -----------------------------------------------------|-------
   GPL-2.0                                                11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note", otherwise it was "GPL-2.0". The results were:
   SPDX license identifier                              # files
   -----------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                          930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
   SPDX license identifier                              # files
   -----------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                          270
   GPL-2.0+ WITH Linux-syscall-note                         169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)       21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)       17
   LGPL-2.1+ WITH Linux-syscall-note                         15
   GPL-1.0+ WITH Linux-syscall-note                          14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)       5
   LGPL-2.0+ WITH Linux-syscall-note                          4
   LGPL-2.1 WITH Linux-syscall-note                           3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)                 3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)                1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
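The rule for files in which neither scanner found any license text can be sketched in a few lines of C. This is only an illustration of the heuristic described above, not the tooling that was actually used; the function name default_spdx_id() is hypothetical, and matching on the substring "/uapi/" is an assumed reading of the */uapi/* pattern.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: pick the SPDX identifier for a file in which
 * no license information was detected. uapi headers get the syscall
 * note, everything else falls back to the top level COPYING license.
 */
const char *default_spdx_id(const char *path)
{
	/* Assumption: "/uapi/" anywhere in the path marks a uapi header. */
	if (strstr(path, "/uapi/") != NULL)
		return "GPL-2.0 WITH Linux-syscall-note";
	return "GPL-2.0";
}
```

For example, default_spdx_id("include/uapi/linux/fs.h") yields the identifier with the Linux-syscall-note, while default_spdx_id("include/linux/genhd.h") yields plain GPL-2.0.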
In total, Kate, Philippe and Thomas logged over 70 hours of manual review
on the spreadsheet to determine the SPDX license identifiers to apply to
the source files, in some cases with confirmation by lawyers working with
the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were new insights. The
Windriver scanner is based in part on an older version of FOSSology, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with the SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally, Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version from earlier this week, with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types). Finally Greg ran the script using the .csv files to
generate the patches.
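The distinction between header and source files can be illustrated with a small C sketch. It is not the script Thomas and Greg used; the helper names has_suffix() and format_spdx_line() are hypothetical. It assumes the convention documented in the kernel's license rules: C source files carry the tag in a `//` comment, while headers use a `/* */` comment, as the first line of this header does.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if path ends with suffix, 0 otherwise. */
int has_suffix(const char *path, const char *suffix)
{
	size_t plen = strlen(path), slen = strlen(suffix);

	return plen >= slen && strcmp(path + plen - slen, suffix) == 0;
}

/*
 * Hypothetical helper: write the SPDX tag line for path into buf.
 * Returns the snprintf() result, or -1 for file types this sketch
 * does not handle.
 */
int format_spdx_line(const char *path, const char *id, char *buf, size_t len)
{
	if (has_suffix(path, ".c"))
		return snprintf(buf, len, "// SPDX-License-Identifier: %s", id);
	if (has_suffix(path, ".h"))
		return snprintf(buf, len, "/* SPDX-License-Identifier: %s */", id);
	return -1; /* Makefiles, assembly etc. are out of scope here */
}
```

A real tool would also have to cover the other file types mentioned above (Make and config files, for instance), which use yet other comment syntaxes.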
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 21:07:57 +07:00
/* SPDX-License-Identifier: GPL-2.0 */
2005-04-17 05:20:36 +07:00
#ifndef _LINUX_GENHD_H
#define _LINUX_GENHD_H

/*
 * genhd.h Copyright (C) 1992 Drew Eckhardt
 * Generic hard disk header file by
 * Drew Eckhardt
 *
 * <drew@colorado.edu>
 */

#include <linux/types.h>
2007-05-22 03:08:01 +07:00
#include <linux/kdev_t.h>
2008-09-03 14:03:02 +07:00
#include <linux/rcupdate.h>
2010-09-01 03:47:05 +07:00
#include <linux/slab.h>
2015-07-16 10:16:45 +07:00
#include <linux/percpu-refcount.h>
2016-05-21 07:01:24 +07:00
#include <linux/uuid.h>
2018-07-18 18:47:38 +07:00
#include <linux/blk_types.h>
2018-12-06 23:41:20 +07:00
#include <asm/local.h>
2005-04-17 05:20:36 +07:00

[PATCH] BLOCK: Make it possible to disable the block layer [try #6]
Make it possible to disable the block layer. Not all embedded devices require
it, some can make do with just JFFS2, NFS, ramfs, etc - none of which require
the block layer to be present.
This patch does the following:
 (*) Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev
     support.
 (*) Adds dependencies on CONFIG_BLOCK to any configuration item that controls
     an item that uses the block layer. This includes:
     (*) Block I/O tracing.
     (*) Disk partition code.
     (*) All filesystems that are block based, eg: Ext3, ReiserFS, ISOFS.
     (*) The SCSI layer. As far as I can tell, even SCSI chardevs use the
         block layer to do scheduling. Some drivers that use SCSI facilities -
         such as USB storage - end up disabled indirectly from this.
     (*) Various block-based device drivers, such as IDE and the old CDROM
         drivers.
     (*) MTD blockdev handling and FTL.
     (*) JFFS - which uses set_bdev_super(), something it could avoid doing by
         taking a leaf out of JFFS2's book.
 (*) Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and
     linux/elevator.h contingent on CONFIG_BLOCK being set. sector_div() is,
     however, still used in places, and so is still available.
 (*) Also made contingent are the contents of linux/mpage.h, linux/genhd.h and
     parts of linux/fs.h.
 (*) Makes a number of files in fs/ contingent on CONFIG_BLOCK.
 (*) Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK.
 (*) set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK
     is not enabled.
 (*) fs/no-block.c is created to hold out-of-line stubs and things that are
     required when CONFIG_BLOCK is not set:
     (*) Default blockdev file operations (to give error ENODEV on opening).
 (*) Makes some /proc changes:
     (*) /proc/devices does not list any blockdevs.
     (*) /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK.
 (*) Makes some compat ioctl handling contingent on CONFIG_BLOCK.
 (*) If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if
     given command other than Q_SYNC or if a special device is specified.
 (*) In init/do_mounts.c, no reference is made to the blockdev routines if
     CONFIG_BLOCK is not defined. This does not prohibit NFS roots or JFFS2.
 (*) The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return
     error ENOSYS by way of cond_syscall if so).
 (*) The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if
     CONFIG_BLOCK is not set, since they can't then happen.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-10-01 01:45:40 +07:00
#ifdef CONFIG_BLOCK

2008-08-29 14:01:47 +07:00
#define dev_to_disk(device) container_of((device), struct gendisk, part0.__dev)
2008-08-25 17:56:05 +07:00
#define dev_to_part(device) container_of((device), struct hd_struct, __dev)
2008-08-29 14:01:47 +07:00
#define disk_to_dev(disk) (&(disk)->part0.__dev)
2008-08-25 17:56:05 +07:00
#define part_to_dev(part) (&((part)->__dev))
2007-05-22 03:08:01 +07:00

extern struct device_type part_type;
extern struct kobject *block_depr;
extern struct class block_class;

2008-08-25 17:56:16 +07:00
#define DISK_MAX_PARTS 256
2008-08-25 17:56:17 +07:00
#define DISK_NAME_LEN 32
2008-08-25 17:56:16 +07:00

2006-04-25 20:07:57 +07:00
#include <linux/major.h>
#include <linux/device.h>
#include <linux/smp.h>
#include <linux/string.h>
#include <linux/fs.h>
2007-05-24 03:57:38 +07:00
#include <linux/workqueue.h>
2006-04-25 20:07:57 +07:00

2008-02-08 17:04:09 +07:00
struct disk_stats {
2018-09-22 06:44:34 +07:00
	u64 nsecs[NR_STAT_GROUPS];
2018-07-18 18:47:38 +07:00
	unsigned long sectors[NR_STAT_GROUPS];
	unsigned long ios[NR_STAT_GROUPS];
	unsigned long merges[NR_STAT_GROUPS];
2008-02-08 17:04:09 +07:00
	unsigned long io_ticks;
2018-12-06 23:41:20 +07:00
	local_t in_flight[2];
2008-02-08 17:04:09 +07:00
};
2010-09-01 03:47:05 +07:00

#define PARTITION_META_INFO_VOLNAMELTH 64
2012-11-09 07:12:25 +07:00
/*
 * Enough for the string representation of any kind of UUID plus NULL.
 * EFI UUID is 36 characters. MSDOS UUID is 11 characters.
 */
2016-05-21 07:01:24 +07:00
#define PARTITION_META_INFO_UUIDLTH (UUID_STRING_LEN + 1)
2010-09-01 03:47:05 +07:00

struct partition_meta_info {
2012-11-09 07:12:25 +07:00
	char uuid[PARTITION_META_INFO_UUIDLTH];
2010-09-01 03:47:05 +07:00
	u8 volname[PARTITION_META_INFO_VOLNAMELTH];
};

2005-04-17 05:20:36 +07:00
struct hd_struct {
	sector_t start_sect;
2012-08-01 17:24:18 +07:00
	/*
	 * nr_sects is protected by sequence counter. One might extend a
	 * partition while IO is happening to it and update of nr_sects
	 * can be non-atomic on 32bit machines with 64bit sector_t.
	 */
2005-04-17 05:20:36 +07:00
	sector_t nr_sects;
2012-08-01 17:24:18 +07:00
	seqcount_t nr_sects_seq;
2009-05-23 04:17:53 +07:00
	sector_t alignment_offset;
2011-05-30 12:42:51 +07:00
	unsigned int discard_alignment;
2008-08-25 17:56:05 +07:00
	struct device __dev;
2006-03-27 16:17:55 +07:00
	struct kobject *holder_dir;
2005-04-17 05:20:36 +07:00
	int policy, partno;
2010-09-01 03:47:05 +07:00
	struct partition_meta_info *info;
2006-12-08 17:39:46 +07:00
#ifdef CONFIG_FAIL_MAKE_REQUEST
	int make_it_fail;
2008-02-08 17:04:09 +07:00
#endif
	unsigned long stamp;
#ifdef CONFIG_SMP
2010-02-02 12:38:57 +07:00
	struct disk_stats __percpu *dkstats;
2008-02-08 17:04:09 +07:00
#else
	struct disk_stats dkstats;
2006-12-08 17:39:46 +07:00
#endif
2015-07-16 10:16:45 +07:00
	struct percpu_ref ref;
block: use rcu_work instead of call_rcu to avoid sleep in softirq
We recently got a stack by syzkaller like this:
BUG: sleeping function called from invalid context at mm/slab.h:361
in_atomic(): 1, irqs_disabled(): 0, pid: 6644, name: blkid
INFO: lockdep is turned off.
CPU: 1 PID: 6644 Comm: blkid Not tainted 4.4.163-514.55.6.9.x86_64+ #76
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
0000000000000000 5ba6a6b879e50c00 ffff8801f6b07b10 ffffffff81cb2194
0000000041b58ab3 ffffffff833c7745 ffffffff81cb2080 5ba6a6b879e50c00
0000000000000000 0000000000000001 0000000000000004 0000000000000000
Call Trace:
<IRQ> [<ffffffff81cb2194>] __dump_stack lib/dump_stack.c:15 [inline]
<IRQ> [<ffffffff81cb2194>] dump_stack+0x114/0x1a0 lib/dump_stack.c:51
[<ffffffff8129a981>] ___might_sleep+0x291/0x490 kernel/sched/core.c:7675
[<ffffffff8129ac33>] __might_sleep+0xb3/0x270 kernel/sched/core.c:7637
[<ffffffff81794c13>] slab_pre_alloc_hook mm/slab.h:361 [inline]
[<ffffffff81794c13>] slab_alloc_node mm/slub.c:2610 [inline]
[<ffffffff81794c13>] slab_alloc mm/slub.c:2692 [inline]
[<ffffffff81794c13>] kmem_cache_alloc_trace+0x2c3/0x5c0 mm/slub.c:2709
[<ffffffff81cbe9a7>] kmalloc include/linux/slab.h:479 [inline]
[<ffffffff81cbe9a7>] kzalloc include/linux/slab.h:623 [inline]
[<ffffffff81cbe9a7>] kobject_uevent_env+0x2c7/0x1150 lib/kobject_uevent.c:227
[<ffffffff81cbf84f>] kobject_uevent+0x1f/0x30 lib/kobject_uevent.c:374
[<ffffffff81cbb5b9>] kobject_cleanup lib/kobject.c:633 [inline]
[<ffffffff81cbb5b9>] kobject_release+0x229/0x440 lib/kobject.c:675
[<ffffffff81cbb0a2>] kref_sub include/linux/kref.h:73 [inline]
[<ffffffff81cbb0a2>] kref_put include/linux/kref.h:98 [inline]
[<ffffffff81cbb0a2>] kobject_put+0x72/0xd0 lib/kobject.c:692
[<ffffffff8216f095>] put_device+0x25/0x30 drivers/base/core.c:1237
[<ffffffff81c4cc34>] delete_partition_rcu_cb+0x1d4/0x2f0 block/partition-generic.c:232
[<ffffffff813c08bc>] __rcu_reclaim kernel/rcu/rcu.h:118 [inline]
[<ffffffff813c08bc>] rcu_do_batch kernel/rcu/tree.c:2705 [inline]
[<ffffffff813c08bc>] invoke_rcu_callbacks kernel/rcu/tree.c:2973 [inline]
[<ffffffff813c08bc>] __rcu_process_callbacks kernel/rcu/tree.c:2940 [inline]
[<ffffffff813c08bc>] rcu_process_callbacks+0x59c/0x1c70 kernel/rcu/tree.c:2957
[<ffffffff8120f509>] __do_softirq+0x299/0xe20 kernel/softirq.c:273
[<ffffffff81210496>] invoke_softirq kernel/softirq.c:350 [inline]
[<ffffffff81210496>] irq_exit+0x216/0x2c0 kernel/softirq.c:391
[<ffffffff82c2cd7b>] exiting_irq arch/x86/include/asm/apic.h:652 [inline]
[<ffffffff82c2cd7b>] smp_apic_timer_interrupt+0x8b/0xc0 arch/x86/kernel/apic/apic.c:926
[<ffffffff82c2bc25>] apic_timer_interrupt+0xa5/0xb0 arch/x86/entry/entry_64.S:746
<EOI> [<ffffffff814cbf40>] ? audit_kill_trees+0x180/0x180
[<ffffffff8187d2f7>] fd_install+0x57/0x80 fs/file.c:626
[<ffffffff8180989e>] do_sys_open+0x45e/0x550 fs/open.c:1043
[<ffffffff818099c2>] SYSC_open fs/open.c:1055 [inline]
[<ffffffff818099c2>] SyS_open+0x32/0x40 fs/open.c:1050
[<ffffffff82c299e1>] entry_SYSCALL_64_fastpath+0x1e/0x9a
In softirq context, we call the RCU callback function delete_partition_rcu_cb(),
which may allocate memory by kzalloc with the GFP_KERNEL flag. If the
allocation cannot be satisfied, it may sleep. However, that is not allowed
in softirq context.
Although we found this problem on linux 4.4, the latest kernel version
seems to have this problem as well. And it is very similar to the
previous one:
https://lkml.org/lkml/2018/7/9/391
Fix it by using RCU workqueue, which allows sleep.
Reviewed-by: Paul E. McKenney <paulmck@linux.ibm.com>
Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-11-28 15:42:01 +07:00
	struct rcu_work rcu_work;
2005-04-17 05:20:36 +07:00
};

2020-03-07 21:56:59 +07:00
/**
 * DOC: genhd capability flags
 *
 * ``GENHD_FL_REMOVABLE`` (0x0001): indicates that the block device
 * gives access to removable media.
 * When set, the device remains present even when media is not
 * inserted.
 * Must not be set for devices which are removed entirely when the
 * media is removed.
 *
 * ``GENHD_FL_CD`` (0x0008): the block device is a CD-ROM-style
 * device.
 * Affects responses to the ``CDROM_GET_CAPABILITY`` ioctl.
 *
 * ``GENHD_FL_UP`` (0x0010): indicates that the block device is "up",
 * with a similar meaning to network interfaces.
 *
 * ``GENHD_FL_SUPPRESS_PARTITION_INFO`` (0x0020): don't include
 * partition information in ``/proc/partitions`` or in the output of
 * printk_all_partitions().
 * Used for the null block device and some MMC devices.
 *
 * ``GENHD_FL_EXT_DEVT`` (0x0040): the driver supports extended
 * dynamic ``dev_t``, i.e. it wants extended device numbers
 * (``BLOCK_EXT_MAJOR``).
 * This affects the maximum number of partitions.
 *
 * ``GENHD_FL_NATIVE_CAPACITY`` (0x0080): based on information in the
 * partition table, the device's capacity has been extended to its
 * native capacity; i.e. the device has hidden capacity used by one
 * of the partitions (this is a flag used so that native capacity is
 * only ever unlocked once).
 *
 * ``GENHD_FL_BLOCK_EVENTS_ON_EXCL_WRITE`` (0x0100): event polling is
 * blocked whenever a writer holds an exclusive lock.
 *
 * ``GENHD_FL_NO_PART_SCAN`` (0x0200): partition scanning is disabled.
 * Used for loop devices in their default settings and some MMC
 * devices.
 *
 * ``GENHD_FL_HIDDEN`` (0x0400): the block device is hidden; it
 * doesn't produce events, doesn't appear in sysfs, and doesn't have
 * an associated ``bdev``.
 * Implies ``GENHD_FL_SUPPRESS_PARTITION_INFO`` and
 * ``GENHD_FL_NO_PART_SCAN``.
 * Used for multipath devices.
 */
#define GENHD_FL_REMOVABLE 0x0001
/* 2 is unused (used to be GENHD_FL_DRIVERFS) */
/* 4 is unused (used to be GENHD_FL_MEDIA_CHANGE_NOTIFY) */
#define GENHD_FL_CD 0x0008
#define GENHD_FL_UP 0x0010
#define GENHD_FL_SUPPRESS_PARTITION_INFO 0x0020
#define GENHD_FL_EXT_DEVT 0x0040
#define GENHD_FL_NATIVE_CAPACITY 0x0080
#define GENHD_FL_BLOCK_EVENTS_ON_EXCL_WRITE 0x0100
#define GENHD_FL_NO_PART_SCAN 0x0200
#define GENHD_FL_HIDDEN 0x0400
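The "implies" relationship described for GENHD_FL_HIDDEN in the DOC comment can be made concrete with a small standalone sketch. The helpers genhd_effective_flags() and genhd_wants_part_scan() are hypothetical and not part of genhd.h; the block layer may well check the flags differently. The three flag values are copied from the definitions above so the example compiles on its own.

```c
#include <assert.h>

/* Values copied from the genhd.h definitions. */
#define GENHD_FL_SUPPRESS_PARTITION_INFO	0x0020
#define GENHD_FL_NO_PART_SCAN			0x0200
#define GENHD_FL_HIDDEN				0x0400

/*
 * Hypothetical helper: expand the flags that GENHD_FL_HIDDEN implies
 * according to the DOC comment.
 */
unsigned int genhd_effective_flags(unsigned int flags)
{
	if (flags & GENHD_FL_HIDDEN)
		flags |= GENHD_FL_SUPPRESS_PARTITION_INFO |
			 GENHD_FL_NO_PART_SCAN;
	return flags;
}

/* Hypothetical predicate: should partitions be scanned for this disk? */
int genhd_wants_part_scan(unsigned int flags)
{
	return !(genhd_effective_flags(flags) & GENHD_FL_NO_PART_SCAN);
}
```

With these helpers, a disk whose flags contain only GENHD_FL_HIDDEN is treated as if partition scanning were disabled, without every caller having to test both bits.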
2005-04-17 05:20:36 +07:00

implement in-kernel gendisk events handling
Currently, media presence polling for removable block devices is done
from userland. There are several issues with this.
* Polling is done by periodically opening the device. For SCSI
devices, the command sequence generated by such action involves a
few different commands including TEST_UNIT_READY. This behavior,
while perfectly legal, is different from Windows which only issues
a single command, GET_EVENT_STATUS_NOTIFICATION. Unfortunately, some
ATAPI devices lock up after being periodically queried with such command
sequences.
* There is no reliable and unintrusive way for a userland program to
tell whether the target device is safe for media presence polling.
For example, polling for media presence during an on-going burning
session can make it fail. The polling program can avoid this by
opening the device with O_EXCL but then it risks making a valid
exclusive user of the device fail w/ -EBUSY.
* Userland polling is unnecessarily heavy and in-kernel implementation
is lighter and better coordinated (workqueue, timer slack).
This patch implements a framework for in-kernel disk event handling,
which includes media presence polling.
* bdops->check_events() is added, which supersedes ->media_changed().
It should check whether there's any pending event and return if so.
Currently, two events are defined - DISK_EVENT_MEDIA_CHANGE and
DISK_EVENT_EJECT_REQUEST. ->check_events() is guaranteed not to be
called in parallel.
* gendisk->events and ->async_events are added. These should be
initialized by block driver before passing the device to add_disk().
The former contains the mask of all supported events and the latter
the mask of all events which the device can report without polling.
/sys/block/*/events[_async] export these to userland.
* Kernel parameter block.events_dfl_poll_msecs controls the system
polling interval (default is 0 which means disable) and
/sys/block/*/events_poll_msecs controls polling intervals for
individual devices (default is -1 meaning use system setting). Note
that if a device can report all supported events asynchronously and
its polling interval isn't explicitly set, the device won't be
polled regardless of the system polling interval.
* If a device is opened exclusively with write access, event checking
is automatically disabled until all write exclusive accesses are
released.
* There are event 'clearing' events. For example, both of the currently
defined events are cleared after the device has been successfully
opened. This information is passed to ->check_events() callback
using @clearing argument as a hint.
* Event checking is always performed from system_nrt_wq and timer
slack is set to 25% for polling.
* Nothing changes for drivers which implement ->media_changed() but
not ->check_events(). Going forward, all drivers will be converted
to ->check_events() and ->media_changed() will be dropped.
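The interaction between the events mask, the async_events mask and the two polling intervals can be sketched as a standalone C function. This is only a model of the rules stated in this message, not the code added by the patch; effective_poll_msecs() and its parameter names are hypothetical. A return value of 0 stands for "do not poll".

```c
#include <assert.h>

/* Event bits as defined by this patch. */
enum {
	DISK_EVENT_MEDIA_CHANGE  = 1 << 0,
	DISK_EVENT_EJECT_REQUEST = 1 << 1,
};

/*
 * Hypothetical model of the polling decision:
 *   events         - mask of all events the device supports
 *   async_events   - mask of events reported without polling
 *   dev_poll_msecs - per-device setting, -1 means "use system setting"
 *   sys_poll_msecs - block.events_dfl_poll_msecs, 0 means disabled
 */
long effective_poll_msecs(unsigned int events, unsigned int async_events,
			  long dev_poll_msecs, long sys_poll_msecs)
{
	/* An explicitly set per-device interval always wins. */
	if (dev_poll_msecs >= 0)
		return dev_poll_msecs;

	/* Every supported event arrives asynchronously: no polling. */
	if ((events & ~async_events) == 0)
		return 0;

	/* Otherwise fall back to the system-wide interval. */
	return sys_poll_msecs;
}
```

For a device that supports both events but reports only DISK_EVENT_MEDIA_CHANGE asynchronously, the system interval applies; once both bits are set in async_events, the device is polled only if its own interval is set explicitly.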
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-12-09 02:57:37 +07:00
enum {
	DISK_EVENT_MEDIA_CHANGE = 1 << 0, /* media changed */
	DISK_EVENT_EJECT_REQUEST = 1 << 1, /* eject requested */
};

2019-03-27 20:51:02 +07:00
enum {
	/* Poll even if events_poll_msecs is unset */
	DISK_EVENT_FLAG_POLL = 1 << 0,
	/* Forward events to udev */
	DISK_EVENT_FLAG_UEVENT = 1 << 1,
};

2008-08-25 17:56:15 +07:00
struct disk_part_tbl {
	struct rcu_head rcu_head;
	int len;
2010-02-25 02:01:56 +07:00
	struct hd_struct __rcu *last_lookup;
	struct hd_struct __rcu *part[];
2008-08-25 17:56:15 +07:00
};

2010-12-09 02:57:37 +07:00
struct disk_events;
2016-01-09 23:36:51 +07:00
struct badblocks;
2010-12-09 02:57:37 +07:00

2015-10-22 00:19:49 +07:00
#if defined(CONFIG_BLK_DEV_INTEGRITY)

struct blk_integrity {
2017-03-25 08:03:48 +07:00
	const struct blk_integrity_profile *profile;
	unsigned char flags;
	unsigned char tuple_size;
	unsigned char interval_exp;
	unsigned char tag_size;
2015-10-22 00:19:49 +07:00
};

#endif /* CONFIG_BLK_DEV_INTEGRITY */

2005-04-17 05:20:36 +07:00
struct gendisk {
2008-08-25 17:56:16 +07:00
	/* major, first_minor and minors are input parameters only,
	 * don't use directly. Use disk_devt() and disk_max_parts().
2008-09-03 14:01:48 +07:00
	 */
2005-04-17 05:20:36 +07:00
	int major; /* major number of driver */
	int first_minor;
	int minors; /* maximum number of minors, =1 for
		     * disks that can't be partitioned. */
2008-09-03 14:01:48 +07:00

2008-08-25 17:56:17 +07:00
	char disk_name[DISK_NAME_LEN]; /* name of major driver */
2011-07-24 07:24:48 +07:00
	char *(*devnode)(struct gendisk *gd, umode_t *mode);
implement in-kernel gendisk events handling
Currently, media presence polling for removable block devices is done
from userland. There are several issues with this.
* Polling is done by periodically opening the device. For SCSI
devices, the command sequence generated by such action involves a
few different commands including TEST_UNIT_READY. This behavior,
while perfectly legal, is different from Windows which only issues
a single command, GET_EVENT_STATUS_NOTIFICATION. Unfortunately, some
ATAPI devices lock up after being periodically queried with such command
sequences.
* There is no reliable and unintrusive way for a userland program to
tell whether the target device is safe for media presence polling.
For example, polling for media presence during an on-going burning
session can make it fail. The polling program can avoid this by
opening the device with O_EXCL but then it risks making a valid
exclusive user of the device fail w/ -EBUSY.
* Userland polling is unnecessarily heavy and in-kernel implementation
is lighter and better coordinated (workqueue, timer slack).
This patch implements framework for in-kernel disk event handling,
which includes media presence polling.
* bdops->check_events() is added, which supercedes ->media_changed().
It should check whether there's any pending event and return if so.
Currently, two events are defined - DISK_EVENT_MEDIA_CHANGE and
DISK_EVENT_EJECT_REQUEST. ->check_events() is guaranteed not to be
called parallelly.
* gendisk->events and ->async_events are added. These should be
initialized by block driver before passing the device to add_disk().
The former contains the mask of all supported events and the latter
the mask of all events which the device can report without polling.
/sys/block/*/events[_async] export these to userland.
* Kernel parameter block.events_dfl_poll_msecs controls the system
polling interval (default is 0 which means disable) and
/sys/block/*/events_poll_msecs control polling intervals for
individual devices (default is -1 meaning use system setting). Note
that if a device can report all supported events asynchronously and
its polling interval isn't explicitly set, the device won't be
polled regardless of the system polling interval.
* If a device is opened exclusively with write access, event checking
is automatically disabled until all write exclusive accesses are
released.
* There are event 'clearing' events. For example, both of currently
defined events are cleared after the device has been successfully
opened. This information is passed to ->check_events() callback
using @clearing argument as a hint.
* Event checking is always performed from system_nrt_wq and timer
slack is set to 25% for polling.
* Nothing changes for drivers which implement ->media_changed() but
not ->check_events(). Going forward, all drivers will be converted
to ->check_events() and ->media_change() will be dropped.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
2010-12-09 02:57:37 +07:00
|
|
|
|
2019-03-27 20:51:02 +07:00
|
|
|
unsigned short events; /* supported events */
|
|
|
|
unsigned short event_flags; /* flags related to event processing */
|
2010-12-09 02:57:37 +07:00
|
|
|
|
2008-09-03 14:06:42 +07:00
|
|
|
/* Array of pointers to partitions indexed by partno.
|
2008-09-03 14:03:02 +07:00
|
|
|
* Protected with matching bdev lock but stat and other
|
|
|
|
* non-critical accesses use RCU. Always access through
|
|
|
|
* helpers.
|
|
|
|
*/
|
2010-02-25 02:01:56 +07:00
|
|
|
struct disk_part_tbl __rcu *part_tbl;
|
2008-09-03 14:06:42 +07:00
|
|
|
struct hd_struct part0;
|
2008-09-03 14:03:02 +07:00
|
|
|
|
2009-09-22 07:01:13 +07:00
|
|
|
const struct block_device_operations *fops;
|
2005-04-17 05:20:36 +07:00
|
|
|
struct request_queue *queue;
|
|
|
|
void *private_data;
|
|
|
|
|
|
|
|
int flags;
|
2018-02-26 19:01:41 +07:00
|
|
|
struct rw_semaphore lookup_sem;
|
2006-03-27 16:17:55 +07:00
|
|
|
struct kobject *slave_dir;
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
struct timer_rand_state *random;
|
|
|
|
atomic_t sync_io; /* RAID */
|
2010-12-09 02:57:37 +07:00
|
|
|
struct disk_events *ev;
|
2008-07-01 01:04:41 +07:00
|
|
|
#ifdef CONFIG_BLK_DEV_INTEGRITY
|
2015-10-22 00:19:27 +07:00
|
|
|
struct kobject integrity_kobj;
|
2015-10-22 00:19:49 +07:00
|
|
|
#endif /* CONFIG_BLK_DEV_INTEGRITY */
|
2008-08-25 17:56:15 +07:00
|
|
|
int node_id;
|
2016-01-09 23:36:51 +07:00
|
|
|
struct badblocks *bb;
|
2017-10-25 15:56:05 +07:00
|
|
|
struct lockdep_map lockdep_map;
|
2005-04-17 05:20:36 +07:00
|
|
|
};
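The polling rule described in the events-handling commit message (an explicit per-device interval wins, and a device whose supported events are all asynchronous is not polled by default) can be sketched in userspace C. This is only a model under stated assumptions: `poll_disk_model` and `effective_poll_msecs()` are hypothetical names, not kernel symbols, and the fields simply mirror the `events`, `async_events` and `events_poll_msecs` settings named in the message.

```c
/* Hypothetical userspace model; none of these names exist in the kernel. */
struct poll_disk_model {
	unsigned int events;        /* mask of all supported events */
	unsigned int async_events;  /* events the device reports without polling */
	long poll_msecs;            /* per-device setting, -1 = use system default */
};

/* Returns the polling interval in msecs, or 0 when polling is disabled. */
static long effective_poll_msecs(const struct poll_disk_model *d,
				 long system_dfl_msecs)
{
	if (d->poll_msecs >= 0)
		return d->poll_msecs;      /* explicit per-device setting wins */
	if (d->events & ~d->async_events)
		return system_dfl_msecs;   /* at least one event needs polling */
	return 0;                          /* all events are async: never polled */
}
```

With a system default of 2000 ms, a device whose two events are both asynchronous yields 0, while a device with one polled event yields 2000.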
|
|
|
|
|
2008-08-25 17:47:17 +07:00
|
|
|
static inline struct gendisk *part_to_disk(struct hd_struct *part)
|
|
|
|
{
|
2008-08-29 14:01:47 +07:00
|
|
|
if (likely(part)) {
|
|
|
|
if (part->partno)
|
|
|
|
return dev_to_disk(part_to_dev(part)->parent);
|
|
|
|
else
|
|
|
|
return dev_to_disk(part_to_dev(part));
|
|
|
|
}
|
2008-08-25 17:47:17 +07:00
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
2008-09-03 14:01:48 +07:00
|
|
|
static inline int disk_max_parts(struct gendisk *disk)
|
|
|
|
{
|
2008-08-25 17:56:16 +07:00
|
|
|
if (disk->flags & GENHD_FL_EXT_DEVT)
|
|
|
|
return DISK_MAX_PARTS;
|
|
|
|
return disk->minors;
|
2008-09-03 14:06:42 +07:00
|
|
|
}
|
|
|
|
|
2011-08-24 01:01:04 +07:00
|
|
|
static inline bool disk_part_scan_enabled(struct gendisk *disk)
|
2008-09-03 14:06:42 +07:00
|
|
|
{
|
2011-08-24 01:01:04 +07:00
|
|
|
return disk_max_parts(disk) > 1 &&
|
|
|
|
!(disk->flags & GENHD_FL_NO_PART_SCAN);
|
2008-09-03 14:01:48 +07:00
|
|
|
}
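The interplay of disk_max_parts() and disk_part_scan_enabled() can be tried out in a small userspace model. The flag values and `MODEL_DISK_MAX_PARTS` below are illustrative placeholders (the real constants are defined elsewhere in the header), and the `scan_disk_model` type is an assumption made for the sketch.

```c
#include <stdbool.h>

/* Placeholder values for illustration; not taken from this section. */
#define MODEL_FL_EXT_DEVT      0x0040
#define MODEL_FL_NO_PART_SCAN  0x0200
#define MODEL_DISK_MAX_PARTS   256

struct scan_disk_model {
	int minors;  /* minors reserved by the driver */
	int flags;   /* MODEL_FL_* bits */
};

/* Extended devt lifts the limit imposed by the reserved minors. */
static int model_max_parts(const struct scan_disk_model *d)
{
	if (d->flags & MODEL_FL_EXT_DEVT)
		return MODEL_DISK_MAX_PARTS;
	return d->minors;
}

/* Scanning needs room for more than the whole-disk entry and no opt-out. */
static bool model_part_scan_enabled(const struct scan_disk_model *d)
{
	return model_max_parts(d) > 1 && !(d->flags & MODEL_FL_NO_PART_SCAN);
}
```

A disk with a single minor is only scanned when the extended-devt flag is set, and the no-scan flag overrides everything else.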
|
|
|
|
|
2020-01-26 20:05:43 +07:00
|
|
|
static inline bool disk_has_partitions(struct gendisk *disk)
|
|
|
|
{
|
|
|
|
bool ret = false;
|
|
|
|
|
|
|
|
rcu_read_lock();
|
|
|
|
if (rcu_dereference(disk->part_tbl)->len > 1)
|
|
|
|
ret = true;
|
|
|
|
rcu_read_unlock();
|
|
|
|
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2008-09-03 14:01:48 +07:00
|
|
|
static inline dev_t disk_devt(struct gendisk *disk)
|
|
|
|
{
|
2017-11-03 01:29:52 +07:00
|
|
|
return MKDEV(disk->major, disk->first_minor);
|
2008-09-03 14:01:48 +07:00
|
|
|
}
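disk_devt() packs the major and first minor into one dev_t with MKDEV(). As a rough userspace illustration, the macros below mirror the in-kernel encoding with 20 minor bits; the `MODEL_` names are assumptions made for this sketch and are not part of the header.

```c
/* Userspace mirror of the in-kernel dev_t layout: 12 major bits, 20 minor bits. */
#define MODEL_MINORBITS 20
#define MODEL_MINORMASK ((1U << MODEL_MINORBITS) - 1)

#define MODEL_MKDEV(ma, mi) (((unsigned int)(ma) << MODEL_MINORBITS) | (mi))
#define MODEL_MAJOR(dev)    ((unsigned int)((dev) >> MODEL_MINORBITS))
#define MODEL_MINOR(dev)    ((unsigned int)((dev) & MODEL_MINORMASK))

/* Same shape as disk_devt(): combine a disk's major with its first minor. */
static unsigned int model_disk_devt(int major, int first_minor)
{
	return MODEL_MKDEV(major, first_minor);
}
```

For major 8 and first minor 16, the packed value is 0x800010, and the two accessor macros recover 8 and 16 again.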
|
|
|
|
|
|
|
|
static inline dev_t part_devt(struct hd_struct *part)
|
|
|
|
{
|
2008-08-25 17:56:05 +07:00
|
|
|
return part_to_dev(part)->devt;
|
2008-09-03 14:01:48 +07:00
|
|
|
}
|
|
|
|
|
2017-10-25 00:21:48 +07:00
|
|
|
extern struct hd_struct *__disk_get_part(struct gendisk *disk, int partno);
|
2008-09-03 14:03:02 +07:00
|
|
|
extern struct hd_struct *disk_get_part(struct gendisk *disk, int partno);
|
|
|
|
|
|
|
|
static inline void disk_put_part(struct hd_struct *part)
|
|
|
|
{
|
|
|
|
if (likely(part))
|
2008-08-25 17:56:05 +07:00
|
|
|
put_device(part_to_dev(part));
|
2008-09-03 14:03:02 +07:00
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Smarter partition iterator without context limits.
|
|
|
|
*/
|
|
|
|
#define DISK_PITER_REVERSE (1 << 0) /* iterate in the reverse direction */
|
|
|
|
#define DISK_PITER_INCL_EMPTY (1 << 1) /* include 0-sized parts */
|
2008-09-03 14:06:42 +07:00
|
|
|
#define DISK_PITER_INCL_PART0 (1 << 2) /* include partition 0 */
|
2009-04-17 13:34:48 +07:00
|
|
|
#define DISK_PITER_INCL_EMPTY_PART0 (1 << 3) /* include empty partition 0 */
|
2008-09-03 14:03:02 +07:00
|
|
|
|
|
|
|
struct disk_part_iter {
|
|
|
|
struct gendisk *disk;
|
|
|
|
struct hd_struct *part;
|
|
|
|
int idx;
|
|
|
|
unsigned int flags;
|
|
|
|
};
|
|
|
|
|
|
|
|
extern void disk_part_iter_init(struct disk_part_iter *piter,
|
|
|
|
struct gendisk *disk, unsigned int flags);
|
|
|
|
extern struct hd_struct *disk_part_iter_next(struct disk_part_iter *piter);
|
|
|
|
extern void disk_part_iter_exit(struct disk_part_iter *piter);
|
|
|
|
|
|
|
|
extern struct hd_struct *disk_map_sector_rcu(struct gendisk *disk,
|
|
|
|
sector_t sector);
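The iterator flags are easier to follow with a toy model. The sketch below is a simplified userspace stand-in, assuming a flat partition array where index 0 is the whole disk; it omits the reference counting, RCU protection and the INCL_EMPTY_PART0 case of the real iterator, and every `piter_`/`PITER_` name is hypothetical.

```c
#include <stddef.h>

#define PITER_REVERSE    (1 << 0)  /* iterate in the reverse direction */
#define PITER_INCL_EMPTY (1 << 1)  /* include 0-sized parts */
#define PITER_INCL_PART0 (1 << 2)  /* include partition 0 */

struct piter_part { int partno; unsigned long nr_sects; };
struct piter_disk { const struct piter_part *tbl; int len; };
struct piter_model {
	const struct piter_disk *disk;
	int idx;
	unsigned int flags;
};

static void piter_init(struct piter_model *it, const struct piter_disk *d,
		       unsigned int flags)
{
	it->disk = d;
	it->flags = flags;
	if (flags & PITER_REVERSE)
		it->idx = d->len - 1;
	else
		it->idx = (flags & PITER_INCL_PART0) ? 0 : 1;
}

/* Returns the next matching partition, or NULL when iteration is done. */
static const struct piter_part *piter_next(struct piter_model *it)
{
	int step = (it->flags & PITER_REVERSE) ? -1 : 1;
	int first = (it->flags & PITER_INCL_PART0) ? 0 : 1;

	while (it->idx >= first && it->idx < it->disk->len) {
		const struct piter_part *p = &it->disk->tbl[it->idx];

		it->idx += step;
		if (p->nr_sects == 0 && !(it->flags & PITER_INCL_EMPTY))
			continue;  /* skip empty slots unless asked for */
		return p;
	}
	return NULL;
}
```

Callers loop on `piter_next()` until it returns NULL, which matches the init/next/exit pattern declared above.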
|
|
|
|
|
2008-08-25 17:47:21 +07:00
|
|
|
/*
|
2005-04-17 05:20:36 +07:00
|
|
|
* Macros to operate on percpu disk statistics:
|
|
|
|
*
|
2008-08-25 17:47:21 +07:00
|
|
|
* part_stat_{add|sub|inc|dec}() modify the stat counters
|
|
|
|
* and should be called between part_stat_lock() and
|
|
|
|
* part_stat_unlock().
|
|
|
|
*
|
|
|
|
* part_stat_read() can be called at any time.
|
|
|
|
*
|
|
|
|
* part_stat_{add|set_all}() and {init|free}_part_stats are for
|
|
|
|
* internal use only.
|
2005-04-17 05:20:36 +07:00
|
|
|
*/
|
|
|
|
#ifdef CONFIG_SMP
|
2008-08-25 17:56:14 +07:00
|
|
|
#define part_stat_lock() ({ rcu_read_lock(); get_cpu(); })
|
|
|
|
#define part_stat_unlock() do { put_cpu(); rcu_read_unlock(); } while (0)
|
2008-02-08 17:04:09 +07:00
|
|
|
|
2018-12-06 23:41:20 +07:00
|
|
|
#define part_stat_get_cpu(part, field, cpu) \
|
|
|
|
(per_cpu_ptr((part)->dkstats, (cpu))->field)
|
|
|
|
|
|
|
|
#define part_stat_get(part, field) \
|
|
|
|
part_stat_get_cpu(part, field, smp_processor_id())
|
2008-02-08 17:04:09 +07:00
|
|
|
|
|
|
|
#define part_stat_read(part, field) \
|
|
|
|
({ \
|
2008-08-25 17:56:14 +07:00
|
|
|
typeof((part)->dkstats->field) res = 0; \
|
2010-01-07 06:45:55 +07:00
|
|
|
unsigned int _cpu; \
|
|
|
|
for_each_possible_cpu(_cpu) \
|
|
|
|
res += per_cpu_ptr((part)->dkstats, _cpu)->field; \
|
2008-02-08 17:04:09 +07:00
|
|
|
res; \
|
|
|
|
})
|
|
|
|
|
2008-05-07 15:15:46 +07:00
|
|
|
static inline void part_stat_set_all(struct hd_struct *part, int value)
|
|
|
|
{
|
2008-02-08 17:04:09 +07:00
|
|
|
int i;
|
2008-05-07 15:15:46 +07:00
|
|
|
|
2008-02-08 17:04:09 +07:00
|
|
|
for_each_possible_cpu(i)
|
|
|
|
memset(per_cpu_ptr(part->dkstats, i), value,
|
2008-05-07 15:15:46 +07:00
|
|
|
sizeof(struct disk_stats));
|
2008-02-08 17:04:09 +07:00
|
|
|
}
|
2008-08-25 17:47:21 +07:00
|
|
|
|
2008-08-25 17:56:14 +07:00
|
|
|
static inline int init_part_stats(struct hd_struct *part)
|
2008-02-08 17:04:09 +07:00
|
|
|
{
|
2008-08-25 17:56:14 +07:00
|
|
|
part->dkstats = alloc_percpu(struct disk_stats);
|
|
|
|
if (!part->dkstats)
|
|
|
|
return 0;
|
|
|
|
return 1;
|
2005-04-17 05:20:36 +07:00
|
|
|
}
|
2008-02-08 17:04:09 +07:00
|
|
|
|
2008-08-25 17:56:14 +07:00
|
|
|
static inline void free_part_stats(struct hd_struct *part)
|
2008-02-08 17:04:09 +07:00
|
|
|
{
|
2008-08-25 17:56:14 +07:00
|
|
|
free_percpu(part->dkstats);
|
2008-02-08 17:04:09 +07:00
|
|
|
}
|
|
|
|
|
2008-08-25 17:56:14 +07:00
|
|
|
#else /* !CONFIG_SMP */
|
|
|
|
#define part_stat_lock() ({ rcu_read_lock(); 0; })
|
|
|
|
#define part_stat_unlock() rcu_read_unlock()
|
2008-08-25 17:47:21 +07:00
|
|
|
|
2018-12-06 23:41:20 +07:00
|
|
|
#define part_stat_get(part, field) ((part)->dkstats.field)
|
|
|
|
#define part_stat_get_cpu(part, field, cpu) part_stat_get(part, field)
|
|
|
|
#define part_stat_read(part, field) part_stat_get(part, field)
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2008-08-25 17:56:14 +07:00
|
|
|
static inline void part_stat_set_all(struct hd_struct *part, int value)
|
2005-04-17 05:20:36 +07:00
|
|
|
{
|
2008-08-25 17:56:14 +07:00
|
|
|
memset(&part->dkstats, value, sizeof(struct disk_stats));
|
2005-04-17 05:20:36 +07:00
|
|
|
}
|
2008-02-08 17:04:09 +07:00
|
|
|
|
|
|
|
static inline int init_part_stats(struct hd_struct *part)
|
|
|
|
{
|
|
|
|
return 1;
|
|
|
|
}
|
|
|
|
|
|
|
|
static inline void free_part_stats(struct hd_struct *part)
|
|
|
|
{
|
|
|
|
}
|
|
|
|
|
2008-08-25 17:56:14 +07:00
|
|
|
#endif /* CONFIG_SMP */
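The SMP variant keeps one counter set per CPU and sums them on read, which is why part_stat_read() can be called at any time without a lock. A minimal single-threaded model of that idea follows; `MODEL_NR_CPUS`, `pcpu_part_model` and the helper names are assumptions for illustration and stand in for the kernel's percpu machinery.

```c
#define MODEL_NR_CPUS 4

struct pcpu_stats_model { unsigned long ios; };

/* One private counter set per CPU, so writers never share a cache line. */
struct pcpu_part_model {
	struct pcpu_stats_model percpu[MODEL_NR_CPUS];
};

/* Writer side: touch only the counters of the CPU we are running on. */
static void model_stat_add_cpu(struct pcpu_part_model *p, int cpu,
			       unsigned long n)
{
	p->percpu[cpu].ios += n;
}

/* Reader side: fold every CPU's counter into one total. */
static unsigned long model_stat_read(const struct pcpu_part_model *p)
{
	unsigned long res = 0;
	int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		res += p->percpu[cpu].ios;
	return res;
}
```

The trade-off is cheap, contention-free updates in exchange for a read that walks all CPUs.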
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2018-07-18 18:47:37 +07:00
|
|
|
#define part_stat_read_accum(part, field) \
|
2018-07-18 18:47:38 +07:00
|
|
|
(part_stat_read(part, field[STAT_READ]) + \
|
2018-07-18 18:47:40 +07:00
|
|
|
part_stat_read(part, field[STAT_WRITE]) + \
|
|
|
|
part_stat_read(part, field[STAT_DISCARD]))
|
2018-07-18 18:47:37 +07:00
|
|
|
|
2018-12-06 23:41:20 +07:00
|
|
|
#define __part_stat_add(part, field, addnd) \
|
|
|
|
(part_stat_get(part, field) += (addnd))
|
|
|
|
|
2018-12-06 23:41:18 +07:00
|
|
|
#define part_stat_add(part, field, addnd) do { \
|
|
|
|
__part_stat_add((part), field, addnd); \
|
2008-08-25 17:56:14 +07:00
|
|
|
if ((part)->partno) \
|
2018-12-06 23:41:18 +07:00
|
|
|
__part_stat_add(&part_to_disk((part))->part0, \
|
2008-08-25 17:56:14 +07:00
|
|
|
field, addnd); \
|
|
|
|
} while (0)
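part_stat_add() accounts every update twice for real partitions: once on the partition and once on the whole-disk entry part0. A small userspace sketch of that rule, with hypothetical `acct_` types standing in for hd_struct and gendisk:

```c
#include <stddef.h>

struct acct_disk;

struct acct_part {
	int partno;              /* 0 means the whole-disk entry */
	unsigned long sectors;   /* example counter */
	struct acct_disk *disk;  /* owning disk */
};

struct acct_disk {
	struct acct_part part0;  /* whole-disk statistics live here */
};

/* Add to the partition, and mirror into part0 unless this already is part0. */
static void acct_stat_add(struct acct_part *p, unsigned long n)
{
	p->sectors += n;
	if (p->partno)
		p->disk->part0.sectors += n;
}
```

Because of the partno check, updates made directly against part0 are counted only once.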
|
2008-02-08 17:04:09 +07:00
|
|
|
|
2018-12-06 23:41:18 +07:00
|
|
|
#define part_stat_dec(gendiskp, field) \
|
|
|
|
part_stat_add(gendiskp, field, -1)
|
|
|
|
#define part_stat_inc(gendiskp, field) \
|
|
|
|
part_stat_add(gendiskp, field, 1)
|
|
|
|
#define part_stat_sub(gendiskp, field, subnd) \
|
|
|
|
part_stat_add(gendiskp, field, -(subnd))
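Negating a macro argument is only safe when the argument is parenthesized before the minus sign is applied. The toy macros below (all names are made up for this sketch) show how an expression argument changes meaning without the parentheses.

```c
/* Toy counterparts of an add macro and two subtract wrappers. */
#define MODEL_ADD(total, n)        ((total) += (n))
#define MODEL_SUB_UNSAFE(total, n) MODEL_ADD(total, -n)    /* -5 - 2 for n = 5 - 2 */
#define MODEL_SUB_SAFE(total, n)   MODEL_ADD(total, -(n))  /* -(5 - 2) */

static long model_sub_unsafe(long total)
{
	MODEL_SUB_UNSAFE(total, 5 - 2);  /* subtracts 7, not 3 */
	return total;
}

static long model_sub_safe(long total)
{
	MODEL_SUB_SAFE(total, 5 - 2);    /* subtracts 3 as intended */
	return total;
}
```

With a plain identifier or literal as the argument, both forms behave the same, which is why such a slip can go unnoticed.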
|
2008-08-25 17:56:14 +07:00
|
|
|
|
2018-12-06 23:41:20 +07:00
|
|
|
#define part_stat_local_dec(gendiskp, field) \
|
|
|
|
local_dec(&(part_stat_get(gendiskp, field)))
|
|
|
|
#define part_stat_local_inc(gendiskp, field) \
|
|
|
|
local_inc(&(part_stat_get(gendiskp, field)))
|
|
|
|
#define part_stat_local_read(gendiskp, field) \
|
|
|
|
local_read(&(part_stat_get(gendiskp, field)))
|
|
|
|
#define part_stat_local_read_cpu(gendiskp, field, cpu) \
|
|
|
|
local_read(&(part_stat_get_cpu(gendiskp, field, cpu)))
|
|
|
|
|
2018-12-06 23:41:21 +07:00
|
|
|
unsigned int part_in_flight(struct request_queue *q, struct hd_struct *part);
|
2018-04-26 14:21:59 +07:00
|
|
|
void part_in_flight_rw(struct request_queue *q, struct hd_struct *part,
|
|
|
|
unsigned int inflight[2]);
|
2017-08-09 06:51:45 +07:00
|
|
|
void part_dec_in_flight(struct request_queue *q, struct hd_struct *part,
|
|
|
|
int rw);
|
|
|
|
void part_inc_in_flight(struct request_queue *q, struct hd_struct *part,
|
|
|
|
int rw);
|
2005-04-17 05:20:36 +07:00
|
|
|
|
block/diskstats: more accurate approximation of io_ticks for slow disks
Currently io_ticks is approximated by adding one at each start and end of
requests if the jiffies counter has changed. This works perfectly for requests
shorter than a jiffy or if one of the requests starts/ends at each jiffy.
If the disk executes just one request at a time and they are longer than two
jiffies then only the first and last jiffies will be accounted.
The fix is simple: at the end of a request, add to io_ticks the jiffies passed
since the last update rather than just one jiffy.
Example: common HDD executes random read 4k requests around 12ms.
fio --name=test --filename=/dev/sdb --rw=randread --direct=1 --runtime=30 &
iostat -x 10 sdb
Note changes of iostat's "%util" 8,43% -> 99,99% before/after patch:
Before:
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdb 0,00 0,00 82,60 0,00 330,40 0,00 8,00 0,96 12,09 12,09 0,00 1,02 8,43
After:
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdb 0,00 0,00 82,50 0,00 330,00 0,00 8,00 1,00 12,10 12,10 0,00 12,12 99,99
Now io_ticks does not lose time between start and end of requests, but
for queue-depth > 1 some I/O time between adjacent starts might be lost.
For load estimation "%util" is not as useful as average queue length,
but it clearly shows how often the disk queue is completely empty.
Fixes: 5b18b5a73760 ("block: delete part_round_stats and switch to less precise counting")
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-03-25 20:07:04 +07:00
|
|
|
void update_io_ticks(struct hd_struct *part, unsigned long now, bool end);
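The accounting change described in the io_ticks commit message can be modelled in a few lines. The sketch below is a single-threaded approximation: `ticks_model` and both helpers are hypothetical, and the atomic update of the timestamp that concurrent kernel code would need is left out.

```c
#include <stdbool.h>

struct ticks_model {
	unsigned long stamp;     /* jiffy of the last update */
	unsigned long io_ticks;  /* accumulated busy time in jiffies */
};

/* Old scheme: at most one tick per start or end when the jiffy changed. */
static void model_update_io_ticks_old(struct ticks_model *m, unsigned long now)
{
	if (m->stamp != now) {
		m->io_ticks += 1;
		m->stamp = now;
	}
}

/* New scheme: on request end, credit all jiffies since the last update. */
static void model_update_io_ticks(struct ticks_model *m, unsigned long now,
				  bool end)
{
	if (m->stamp != now) {
		m->io_ticks += end ? now - m->stamp : 1;
		m->stamp = now;
	}
}
```

For a single 12-jiffy request the old scheme records 2 ticks while the new one records 13, which is the kind of gap behind the "%util" difference quoted in the message.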
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2009-03-10 14:25:54 +07:00
|
|
|
/* block/genhd.c */
|
2018-09-28 13:17:19 +07:00
|
|
|
extern void device_add_disk(struct device *parent, struct gendisk *disk,
|
|
|
|
const struct attribute_group **groups);
|
2016-06-16 08:17:27 +07:00
|
|
|
static inline void add_disk(struct gendisk *disk)
|
|
|
|
{
|
2018-09-28 13:17:19 +07:00
|
|
|
device_add_disk(NULL, disk, NULL);
|
2016-06-16 08:17:27 +07:00
|
|
|
}
|
2018-01-09 10:01:13 +07:00
|
|
|
extern void device_add_disk_no_queue_reg(struct device *parent, struct gendisk *disk);
|
|
|
|
static inline void add_disk_no_queue_reg(struct gendisk *disk)
|
|
|
|
{
|
|
|
|
device_add_disk_no_queue_reg(NULL, disk);
|
|
|
|
}
|
2016-06-16 08:17:27 +07:00
|
|
|
|
2005-04-17 05:20:36 +07:00
|
|
|
extern void del_gendisk(struct gendisk *gp);
|
2008-09-03 14:01:09 +07:00
|
|
|
extern struct gendisk *get_gendisk(dev_t dev, int *partno);
|
2008-09-03 14:01:48 +07:00
|
|
|
extern struct block_device *bdget_disk(struct gendisk *disk, int partno);
|
2005-04-17 05:20:36 +07:00
|
|
|
|
|
|
|
extern void set_device_ro(struct block_device *bdev, int flag);
|
|
|
|
extern void set_disk_ro(struct gendisk *disk, int flag);
|
|
|
|
|
2008-08-25 17:56:10 +07:00
|
|
|
static inline int get_disk_ro(struct gendisk *disk)
|
|
|
|
{
|
|
|
|
return disk->part0.policy;
|
|
|
|
}
|
|
|
|
|
2010-12-09 02:57:37 +07:00
|
|
|
extern void disk_block_events(struct gendisk *disk);
|
|
|
|
extern void disk_unblock_events(struct gendisk *disk);
|
2011-07-01 21:17:47 +07:00
|
|
|
extern void disk_flush_events(struct gendisk *disk, unsigned int mask);
|
2020-03-13 12:30:05 +07:00
|
|
|
extern void set_capacity_revalidate_and_notify(struct gendisk *disk,
|
|
|
|
sector_t size, bool revalidate);
|
2010-12-09 02:57:37 +07:00
|
|
|
extern unsigned int disk_clear_events(struct gendisk *disk, unsigned int mask);
|
|
|
|
|
2005-04-17 05:20:36 +07:00
|
|
|
/* drivers/char/random.c */
|
2016-06-21 01:42:34 +07:00
|
|
|
extern void add_disk_randomness(struct gendisk *disk) __latent_entropy;
|
2005-04-17 05:20:36 +07:00
|
|
|
extern void rand_initialize_disk(struct gendisk *disk);
|
|
|
|
|
|
|
|
static inline sector_t get_start_sect(struct block_device *bdev)
|
|
|
|
{
|
2008-08-25 17:56:12 +07:00
|
|
|
return bdev->bd_part->start_sect;
|
2005-04-17 05:20:36 +07:00
|
|
|
}
|
|
|
|
static inline sector_t get_capacity(struct gendisk *disk)
|
|
|
|
{
|
2008-08-25 17:56:07 +07:00
|
|
|
return disk->part0.nr_sects;
|
2005-04-17 05:20:36 +07:00
|
|
|
}
|
|
|
|
static inline void set_capacity(struct gendisk *disk, sector_t size)
|
|
|
|
{
|
2008-08-25 17:56:07 +07:00
|
|
|
disk->part0.nr_sects = size;
|
2005-04-17 05:20:36 +07:00
|
|
|
}
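get_capacity() and set_capacity() work in units of 512-byte sectors, independent of the device's logical block size. A short conversion sketch, with helper names that are assumptions for illustration only:

```c
#include <stdint.h>

/* Capacity is counted in 512-byte sectors, i.e. a shift of 9 bits. */
#define MODEL_SECTOR_SHIFT 9

static uint64_t model_sectors_to_bytes(uint64_t sectors)
{
	return sectors << MODEL_SECTOR_SHIFT;
}

/* Rounds down: a trailing partial sector is not counted. */
static uint64_t model_bytes_to_sectors(uint64_t bytes)
{
	return bytes >> MODEL_SECTOR_SHIFT;
}
```

A driver exposing a 1 MiB device would therefore pass 2048 to set_capacity().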
|
|
|
|
|
2007-02-11 14:50:00 +07:00
|
|
|
#define ADDPART_FLAG_NONE 0
|
|
|
|
#define ADDPART_FLAG_RAID 1
|
|
|
|
#define ADDPART_FLAG_WHOLEDISK 2
|
|
|
|
|
2008-08-25 17:47:22 +07:00
|
|
|
extern int blk_alloc_devt(struct hd_struct *part, dev_t *devt);
|
|
|
|
extern void blk_free_devt(dev_t devt);
|
2019-04-02 19:06:34 +07:00
|
|
|
extern void blk_invalidate_devt(dev_t devt);
|
2008-09-03 14:01:09 +07:00
|
|
|
extern dev_t blk_lookup_devt(const char *name, int partno);
|
|
|
|
extern char *disk_name (struct gendisk *hd, int partno, char *buf);
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2019-11-14 21:34:35 +07:00
|
|
|
int bdev_disk_changed(struct block_device *bdev, bool invalidate);
|
2019-11-14 21:34:34 +07:00
|
|
|
int blk_add_partitions(struct gendisk *disk, struct block_device *bdev);
|
|
|
|
int blk_drop_partitions(struct gendisk *disk, struct block_device *bdev);
|
2008-08-25 17:56:15 +07:00
|
|
|
extern int disk_expand_part_tbl(struct gendisk *disk, int target);
|
2008-11-10 13:29:58 +07:00
|
|
|
extern struct hd_struct * __must_check add_partition(struct gendisk *disk,
|
|
|
|
int partno, sector_t start,
|
2010-09-01 03:47:05 +07:00
|
|
|
sector_t len, int flags,
|
|
|
|
struct partition_meta_info
|
|
|
|
*info);
|
2015-07-16 10:16:45 +07:00
|
|
|
extern void __delete_partition(struct percpu_ref *);
|
2005-04-17 05:20:36 +07:00
|
|
|
extern void delete_partition(struct gendisk *, int);
|
2007-05-09 16:33:24 +07:00
|
|
|
extern void printk_all_partitions(void);
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2017-10-25 15:56:05 +07:00
|
|
|
extern struct gendisk *__alloc_disk_node(int minors, int node_id);
|
2018-02-26 19:01:38 +07:00
|
|
|
extern struct kobject *get_disk_and_module(struct gendisk *disk);
|
2005-04-17 05:20:36 +07:00
|
|
|
extern void put_disk(struct gendisk *disk);
|
2018-02-26 19:01:39 +07:00
|
|
|
extern void put_disk_and_module(struct gendisk *disk);
|
2007-05-22 03:08:01 +07:00
|
|
|
extern void blk_register_region(dev_t devt, unsigned long range,
|
2005-04-17 05:20:36 +07:00
|
|
|
struct module *module,
|
|
|
|
struct kobject *(*probe)(dev_t, int *, void *),
|
|
|
|
int (*lock)(dev_t, void *),
|
|
|
|
void *data);
|
2007-05-22 03:08:01 +07:00
|
|
|
extern void blk_unregister_region(dev_t devt, unsigned long range);
|
2005-04-17 05:20:36 +07:00
|
|
|
|
2017-10-25 15:56:05 +07:00
|
|
|
#define alloc_disk_node(minors, node_id) \
({ \
	static struct lock_class_key __key; \
	const char *__name; \
	struct gendisk *__disk; \
	\
	__name = "(gendisk_completion)"#minors"("#node_id")"; \
	\
	__disk = __alloc_disk_node(minors, node_id); \
	\
	if (__disk) \
		lockdep_init_map(&__disk->lockdep_map, __name, &__key, 0); \
	\
	__disk; \
})

#define alloc_disk(minors) alloc_disk_node(minors, NUMA_NO_NODE)

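The lockdep name in alloc_disk_node() is built by stringifying the macro arguments and relying on adjacent string literals being concatenated. The sketch below isolates only that step in plain user-space C. The names demo_key_name, demo_key_name_default, DEMO_NO_NODE and DEMO_MINORS are hypothetical and are not part of genhd.h; no lockdep or gendisk code is involved.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for NUMA_NO_NODE; not a kernel definition. */
#define DEMO_NO_NODE (-1)

/*
 * Mirrors only the name-building line of alloc_disk_node():
 * '#' turns each argument into a string literal exactly as spelled
 * at the call site, and adjacent literals are concatenated.
 */
#define demo_key_name(minors, node_id) \
	"(gendisk_completion)" #minors "(" #node_id ")"

/*
 * Mirrors alloc_disk(): forwards minors and fixes the node argument.
 * Here 'minors' is not an operand of '#', so it is macro-expanded
 * before being handed to demo_key_name().
 */
#define demo_key_name_default(minors) demo_key_name(minors, DEMO_NO_NODE)

/* Hypothetical driver constant used to show the difference. */
#define DEMO_MINORS 16
```

Under these assumptions, demo_key_name(DEMO_MINORS, 0) keeps the spelling DEMO_MINORS in the resulting string, while demo_key_name_default(DEMO_MINORS) yields 16 for the first argument and the unexpanded text DEMO_NO_NODE for the second. This suggests the name seen by lockdep depends on how the caller spelled the arguments, not only on their values.
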
static inline int hd_ref_init(struct hd_struct *part)
{
	if (percpu_ref_init(&part->ref, __delete_partition, 0,
				GFP_KERNEL))
		return -ENOMEM;
	return 0;
}

static inline void hd_struct_get(struct hd_struct *part)
{
	percpu_ref_get(&part->ref);
}

static inline int hd_struct_try_get(struct hd_struct *part)
{
	return percpu_ref_tryget_live(&part->ref);
}

static inline void hd_struct_put(struct hd_struct *part)
{
	percpu_ref_put(&part->ref);
}

static inline void hd_struct_kill(struct hd_struct *part)
{
	percpu_ref_kill(&part->ref);
}

static inline void hd_free_part(struct hd_struct *part)
{
	free_part_stats(part);
	kfree(part->info);
	percpu_ref_exit(&part->ref);
}

/*
 * Any access to part->nr_sects that is not protected by the partition's
 * bd_mutex or the gendisk bdev's bd_mutex should go through this
 * accessor function.
 *
 * Code written along the lines of i_size_read() and i_size_write().
 * The CONFIG_PREEMPTION case optimizes for a UP kernel with preemption
 * enabled.
 */
static inline sector_t part_nr_sects_read(struct hd_struct *part)
{
#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
	sector_t nr_sects;
	unsigned seq;
	do {
		seq = read_seqcount_begin(&part->nr_sects_seq);
		nr_sects = part->nr_sects;
	} while (read_seqcount_retry(&part->nr_sects_seq, seq));
	return nr_sects;
#elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPTION)
	sector_t nr_sects;

	preempt_disable();
	nr_sects = part->nr_sects;
	preempt_enable();
	return nr_sects;
#else
	return part->nr_sects;
#endif
}

/*
 * Should be called with a mutex held (typically the partition's bd_mutex)
 * to provide mutual exclusion among writers; otherwise the seqcount might
 * be left in the wrong state, leaving readers spinning indefinitely.
 */
static inline void part_nr_sects_write(struct hd_struct *part, sector_t size)
{
#if BITS_PER_LONG==32 && defined(CONFIG_SMP)
	write_seqcount_begin(&part->nr_sects_seq);
	part->nr_sects = size;
	write_seqcount_end(&part->nr_sects_seq);
#elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPTION)
	preempt_disable();
	part->nr_sects = size;
	preempt_enable();
#else
	part->nr_sects = size;
#endif
}

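The 32-bit SMP branch of the two accessors above follows the sequence-counter pattern: the writer makes the counter odd while it updates the value, and a reader retries if the counter was odd or changed during its read. The following is a minimal user-space sketch of that idea using C11 atomics, assuming a 64-bit value stored as two 32-bit halves to imitate a 32-bit machine. It is not the kernel's seqcount API; struct demo_part, demo_read() and demo_write() are hypothetical names, and like part_nr_sects_write() the writer assumes callers already serialize writers with some external lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the nr_sects/nr_sects_seq pair. */
struct demo_part {
	atomic_uint seq;          /* even: stable, odd: write in progress */
	_Atomic uint32_t lo, hi;  /* 64-bit value kept as two 32-bit halves */
};

/* Reader: retry until both halves were read under one even sequence. */
static uint64_t demo_read(struct demo_part *p)
{
	unsigned int seq;
	uint32_t lo, hi;

	do {
		/* Wait out a write that is currently in progress. */
		while ((seq = atomic_load_explicit(&p->seq,
						   memory_order_acquire)) & 1)
			;
		lo = atomic_load_explicit(&p->lo, memory_order_relaxed);
		hi = atomic_load_explicit(&p->hi, memory_order_relaxed);
		/* Order the data loads before the sequence re-check. */
		atomic_thread_fence(memory_order_acquire);
	} while (atomic_load_explicit(&p->seq, memory_order_relaxed) != seq);

	return ((uint64_t)hi << 32) | lo;
}

/* Writer: callers must guarantee that only one writer runs at a time. */
static void demo_write(struct demo_part *p, uint64_t v)
{
	unsigned int seq = atomic_load_explicit(&p->seq, memory_order_relaxed);

	/* Odd sequence: readers that see it will wait or retry. */
	atomic_store_explicit(&p->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&p->lo, (uint32_t)v, memory_order_relaxed);
	atomic_store_explicit(&p->hi, (uint32_t)(v >> 32), memory_order_relaxed);
	/* Even again: publish the completed update. */
	atomic_store_explicit(&p->seq, seq + 2, memory_order_release);
}
```

If two writers ran demo_write() concurrently in this sketch, the counter could end up odd with no write in progress, and readers would then spin forever. That is the failure mode the comment on part_nr_sects_write() warns about.
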
#if defined(CONFIG_BLK_DEV_INTEGRITY)
extern void blk_integrity_add(struct gendisk *);
extern void blk_integrity_del(struct gendisk *);
#else	/* CONFIG_BLK_DEV_INTEGRITY */
static inline void blk_integrity_add(struct gendisk *disk) { }
static inline void blk_integrity_del(struct gendisk *disk) { }
#endif	/* CONFIG_BLK_DEV_INTEGRITY */

#else /* CONFIG_BLOCK */

static inline void printk_all_partitions(void) { }

static inline dev_t blk_lookup_devt(const char *name, int partno)
{
	dev_t devt = MKDEV(0, 0);
	return devt;
}
#endif /* CONFIG_BLOCK */

[PATCH] BLOCK: Make it possible to disable the block layer [try #6]

Make it possible to disable the block layer.  Not all embedded devices require
it, some can make do with just JFFS2, NFS, ramfs, etc - none of which require
the block layer to be present.

This patch does the following:

 (*) Introduces CONFIG_BLOCK to disable the block layer, buffering and blockdev
     support.

 (*) Adds dependencies on CONFIG_BLOCK to any configuration item that controls
     an item that uses the block layer.  This includes:

     (*) Block I/O tracing.

     (*) Disk partition code.

     (*) All filesystems that are block based, eg: Ext3, ReiserFS, ISOFS.

     (*) The SCSI layer.  As far as I can tell, even SCSI chardevs use the
         block layer to do scheduling.  Some drivers that use SCSI facilities -
         such as USB storage - end up disabled indirectly from this.

     (*) Various block-based device drivers, such as IDE and the old CDROM
         drivers.

     (*) MTD blockdev handling and FTL.

     (*) JFFS - which uses set_bdev_super(), something it could avoid doing by
         taking a leaf out of JFFS2's book.

 (*) Makes most of the contents of linux/blkdev.h, linux/buffer_head.h and
     linux/elevator.h contingent on CONFIG_BLOCK being set.  sector_div() is,
     however, still used in places, and so is still available.

 (*) Also made contingent are the contents of linux/mpage.h, linux/genhd.h and
     parts of linux/fs.h.

 (*) Makes a number of files in fs/ contingent on CONFIG_BLOCK.

 (*) Makes mm/bounce.c (bounce buffering) contingent on CONFIG_BLOCK.

 (*) set_page_dirty() doesn't call __set_page_dirty_buffers() if CONFIG_BLOCK
     is not enabled.

 (*) fs/no-block.c is created to hold out-of-line stubs and things that are
     required when CONFIG_BLOCK is not set:

     (*) Default blockdev file operations (to give error ENODEV on opening).

 (*) Makes some /proc changes:

     (*) /proc/devices does not list any blockdevs.

     (*) /proc/diskstats and /proc/partitions are contingent on CONFIG_BLOCK.

 (*) Makes some compat ioctl handling contingent on CONFIG_BLOCK.

 (*) If CONFIG_BLOCK is not defined, makes sys_quotactl() return -ENODEV if
     given command other than Q_SYNC or if a special device is specified.

 (*) In init/do_mounts.c, no reference is made to the blockdev routines if
     CONFIG_BLOCK is not defined.  This does not prohibit NFS roots or JFFS2.

 (*) The bdflush, ioprio_set and ioprio_get syscalls can now be absent (return
     error ENOSYS by way of cond_syscall if so).

 (*) The seclvl_bd_claim() and seclvl_bd_release() security calls do nothing if
     CONFIG_BLOCK is not set, since they can't then happen.

Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2006-10-01 01:45:40 +07:00
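
The commit message above describes making header contents contingent on CONFIG_BLOCK, and the #else branch of this header supplies empty or trivial inline stubs so that callers still compile when the block layer is disabled. A minimal user-space sketch of that pattern follows. DEMO_CONFIG_BLOCK and demo_lookup_devt() are hypothetical names chosen for illustration; the real header uses CONFIG_BLOCK, dev_t and MKDEV(0, 0).

```c
#include <assert.h>

/*
 * Hypothetical switch standing in for CONFIG_BLOCK. It is left
 * undefined here so the stub branch below is the one that is built.
 */
/* #define DEMO_CONFIG_BLOCK 1 */

#ifdef DEMO_CONFIG_BLOCK
/* Feature enabled: only declare; the real definition lives elsewhere. */
unsigned int demo_lookup_devt(const char *name, int partno);
#else
/*
 * Feature disabled: an inline stub keeps every caller compiling and
 * returns 0 as a "no device" value, standing in for MKDEV(0, 0).
 */
static inline unsigned int demo_lookup_devt(const char *name, int partno)
{
	(void)name;
	(void)partno;
	return 0;
}
#endif
```

With the switch undefined, a call such as demo_lookup_devt("sda", 1) compiles without any #ifdef at the call site and evaluates to 0, which is the main benefit of keeping the conditional in the header rather than in each caller.
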
#endif /* _LINUX_GENHD_H */