* drbd-8.3:
documentation: Documented detach's --force and disk's --disk-timeout
drbd: Implemented the disk-timeout option
drbd: Force flag for the detach operation
drbd: Allow new IOs while the local disk in in FAILED state
drbd: Bitmap IO functions can not return prematurely if the disk breaks
drbd: Added a kref to bm_aio_ctx
drbd: Hold a reference to ldev while doing meta-data IO
drbd: Keep a reference to the bio until the completion handler finished
drbd: Implemented wait_until_done_or_disk_failure()
drbd: Replaced md_io_mutex by an atomic: md_io_in_use
drbd: moved md_io into mdev
drbd: Immediately allow completion of IOs, that wait for IO completions on a failed disk
drbd: Keep a reference to barrier acked requests
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* commit 'ae57a0a':
drbd: Only print sanitize state's warnings, if the state change happens
drbd: we should write meta data updates with FLUSH FUA
drbd: fix limit define, we support 1 PiByte now
drbd: fix log message argument order
drbd: Typo in user-visible message.
drbd: Make "(rcv|snd)buf-size" and "ping-timeout" available for the proxy, too.
drbd: Allow keywords to be used in multiple config sections.
drbd: fix typos in comments.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Allow up to 300 centi-seconds to be configured for the "ping timeout".
There may be setups where heavy congestion, huge buffers, and asymmetric
bandwidth limitations may need a "huge" ping-timeout as work-around
for "spurious connection loss" problems.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
drbdadm already has a --dry-run option, so this option cannot directly be
passed through to drbdsetup. Rename the drbdsetup option to resolve this
conflict.
For backward compatibility, make --dry-run an alias of --tentative.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
This is done by introducing drbd_nla_find_nested() which handles the flag
before calling nla_find_nested().
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
It is not "to small", but "too small".
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
We need to remove the flag before checking for valid types.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Make it more clear in the flag names which flags are internal to drbd, and
which are not.
The check for mandatory attributes is the only extension visible at the netlink
layer. Attributes with this flag set would look like unknown attributes to
some kernel versions. The netlink layer would ignore them and also skip
consistency checks on the attribute type and legth. To avoid this, we check
for mandatory attributes first, remove the mandatory flag, and then process the
attributes normally.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Note: All input values are still treated as signed; unsigned long long values
are still broken.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
These constants are useful for the same purpose in more than one place.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
* the peer does not speak protocol_version 100 and the
user wants to change one of:
- wire_protocol
- two_primaries
- integrity_alg
* the user wants to remove the allow_two_primaries flag
when there are two primaries
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
We don't have the units in constant names in other places, either.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
The 32-bit resync_after netlink field takes a device minor number as
parameter, which is no longer limited to 255. We cannot statically
verify which device numbers are valid, so set the ummer limit to the
highest possible signed 32-bit integer.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
This is what it is called in config files and on the command line as
well.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Change the --no-tcp-cork drbdsetup command line option as well as
the no_cork netlink packet.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Change the --no-md-flushes drbdsetup command line option as well as
the no_md_flush netlink packet.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Change the --no-disk-drain drbdsetup command line option as well as
the no_disk_drain netlink packet.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Change the --no-disk-flushes drbdsetup command line option as well as
the no_disk_flush netlink packet.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Flags of type NLA_FLAG are either present or absent, but do not have a
value by themselves. Use type NLA_U8 for our boolean flags instead, and
use the value to determine if the flag should be on or off.
On the drbdsetup command line, all those flags have an optional yes/no
argument which defaults to yes.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
...and drop explicit typecasts (int)meta_dev_idx < 0.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
The 8 byte header finally becomes too small. With the protocol 100 header we
have 16 bit for the volume number, proper 32 bit for the data length, and
32 bit for further extensions in the future.
Previous versions of drbd are using version 80 headers for all packets
short enough for protocol 80. They support both header versions in
worker context, but only version 80 headers in asynchronous context.
For backwards compatibility, continue to use version 80 headers for
short packets before protocol version 100.
From protocol version 100 on, use the same header version for all
packets.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
This commit breaks the API again.
Move per-volume former syncer options into disk_conf.
Move per-connection former syncer options into net_conf.
Renamed the remainign sync_conf to res_opts
Syncer settings have been changeable at runtime, so we need to prepare
for these settings to be runtime-changeable in their new home as well.
Introduce new configuration operations, and share the netlink attribute
between "attach" (create new disk) and "disk-opts" (change options).
Same for "connect" and "net-opts".
Some fields cannot be changed at runtime, however.
Introduce a new flag GENLA_F_INVARIANT to be able to trigger on that in
the generated validation and assignment functions.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
This greatly simplifies deconfiguration of whole resources.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
This adds the new API header and helper files.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Old default behaviour was "pass-on",
which is not useful in production at all.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Up to now it only operated on minor numbers. Now it can work also
on named connections.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Use a new on-disk transaction format for the activity log, which allows
for multiple changes to the active set per transaction.
Using 4k transaction blocks, we can now get rid of the work-around code
to deal with devices not supporting 512 byte logical block size.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Allow multiple changes to the active set of elements in lru_cache.
The only current user of lru_cache, drbd, is driving this generalisation.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Some open-coded clear_bit(); smp_mb__after_clear_bit();
should in fact have been smp_mb__before_clear_bit(); clear_bit();
Instead, use clear_bit_unlock() to annotate the intention,
and have it do the right thing.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
The new header layout will only be used if the peer supports
it of course.
For the first packet and the handshake packet the old (h80)
layout is used for compatibility reasons.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Converting the constants happens at compile time.
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Automatic partition scanning can be requested individually per loop
device during its setup by setting LO_FLAGS_PARTSCAN. By default, no
partition tables are scanned.
Userspace can now always add and remove partitions from all loop
devices, regardless if the in-kernel partition scanner is enabled or
not.
The needed partition minor numbers are allocated from the extended
minors space, the main loop device numbers will continue to match the
loop minors, regardless of the number of partitions used.
# grep . /sys/class/block/loop1/loop/*
/sys/block/loop1/loop/autoclear:0
/sys/block/loop1/loop/backing_file:/home/kay/data/stuff/part.img
/sys/block/loop1/loop/offset:0
/sys/block/loop1/loop/partscan:1
/sys/block/loop1/loop/sizelimit:0
# ls -l /dev/loop*
brw-rw---- 1 root disk 7, 0 Aug 14 20:22 /dev/loop0
brw-rw---- 1 root disk 7, 1 Aug 14 20:23 /dev/loop1
brw-rw---- 1 root disk 259, 0 Aug 14 20:23 /dev/loop1p1
brw-rw---- 1 root disk 259, 1 Aug 14 20:23 /dev/loop1p2
brw-rw---- 1 root disk 7, 99 Aug 14 20:23 /dev/loop99
brw-rw---- 1 root disk 259, 2 Aug 14 20:23 /dev/loop99p1
brw-rw---- 1 root disk 259, 3 Aug 14 20:23 /dev/loop99p2
crw------T 1 root root 10, 237 Aug 14 20:22 /dev/loop-control
Cc: Karel Zak <kzak@redhat.com>
Cc: Davidlohr Bueso <dave@gnu.org>
Acked-By: Tejun Heo <tj@kernel.org>
Signed-off-by: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
There are cases where suppressing partition scan is useful - e.g. for
lo devices and pseudo SATA devices which advertise to be a disk but
get upset on partition scan (some port multiplier control devices show
such behavior).
This patch adds GENHD_FL_NO_PART_SCAN which suppresses partition scan
regardless of the number of possible partitions. disk_partitionable()
is renamed to disk_part_scan_enabled() as suppressing partition scan
doesn't imply the device can't be partitioned using
BLKPG_ADD/DEL_PARTITION calls from userland. show_partition() now
directly tests disk_max_parts() to maintain backward-compatibility.
-v2: Updated to make it clear that only partition scan is suppressed
not partitioning itself as suggested by Kay Sievers.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Kay Sievers <kay.sievers@vrfy.org>
Signed-off-by: Jens Axboe <jaxboe@fusionio.com>
task->cred is declared as __rcu, and access to other tasks' ->cred is,
indeed, protected. Access to current->cred does not need rcu_dereference()
at all, since only the task itself can change its ->cred. sparse, of
course, has no way of knowing that...
Add force-cast in current_cred(), make current_fsuid() et.al. use it.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>