linux_dsm_epyc7002/drivers/block
Lars Ellenberg dd4f699da6 drbd: when receiving P_TRIM, zero-out partial unaligned chunks
We can avoid spurious data divergence caused by partially-ignored
discards on certain backends with discard_zeroes_data=0, if we
translate partial unaligned discard requests into explicit zero-out.

The relevant use case is LVM/DM thin.

If on different nodes, DRBD is backed by devices with differing
discard characteristics, discards may lead to data divergence
(old data or garbage left over on one backend, zeroes due to
unmapped areas on the other backend). Online verify would now
potentially report tons of spurious differences.

While probably harmless for most use cases (fstrim on a file system),
DRBD cannot have that, it would violate our promise to upper layers
that our data instances on the nodes are identical.

To be correct and play safe (make sure data is identical on both copies),
we would have to disable discard support, if our local backend (on a
Primary) does not support "discard_zeroes_data=true".

We'd also have to translate discards to explicit zero-out on the
receiving (typically: Secondary) side, unless the receiving side
supports "discard_zeroes_data=true".

Which both would allocate those blocks, instead of unmapping them,
in contrast with expectations.

LVM/DM thin does set discard_zeroes_data=0,
because it silently ignores discards to partial chunks.

We can work around this by checking the alignment first.
For unaligned (wrt. alignment and granularity) or too small discards,
we zero-out the initial (and/or) trailing unaligned partial chunks,
but discard all the aligned full chunks.

At least for LVM/DM thin, the result is effectively "discard_zeroes_data=1".

Arguably it should behave this way internally, by default,
and we'll try to make that happen.

But our workaround is still valid for already deployed setups,
and for other devices that may behave this way.

Setting discard-zeroes-if-aligned=yes will allow DRBD to use
discards, and to announce discard_zeroes_data=true, even on
backends that announce discard_zeroes_data=false.

Setting discard-zeroes-if-aligned=no will cause DRBD to always
fall-back to zero-out on the receiving side, and to not even
announce discard capabilities on the Primary, if the respective
backend announces discard_zeroes_data=false.

We used to ignore the discard_zeroes_data setting completely.
To not break established and expected behaviour, and suddenly
cause fstrim on thin-provisioned LVs to run out-of-space,
instead of freeing up space, the default value is "yes".

Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2016-06-13 21:43:05 -06:00
..
aoe mm: rename _count, field of the struct page, to _refcount 2016-05-19 19:12:14 -07:00
drbd drbd: when receiving P_TRIM, zero-out partial unaligned chunks 2016-06-13 21:43:05 -06:00
mtip32xx drivers: use req op accessor 2016-06-07 13:41:38 -06:00
paride paride: make 'verbose' parameter an 'int' again 2016-03-15 16:55:16 -07:00
rsxx block, fs, mm, drivers: use bio set/get op accessors 2016-06-07 13:41:38 -06:00
xen-blkback block: add a separate operation type for secure erase 2016-06-09 09:52:25 -06:00
zram block, fs, mm, drivers: use bio set/get op accessors 2016-06-07 13:41:38 -06:00
amiflop.c block: drop owner assignment from platform_drivers 2014-10-20 16:20:18 +02:00
ataflop.c Merge branch 'for-3.16/core' of git://git.kernel.dk/linux-block into next 2014-06-02 09:29:34 -07:00
brd.c block, fs, mm, drivers: use bio set/get op accessors 2016-06-07 13:41:38 -06:00
cciss_cmd.h
cciss_scsi.c scsi: Do not set cmd_per_lun to 1 in the host template 2015-05-31 18:06:28 -07:00
cciss_scsi.h
cciss.c SCSI misc on 20160113 2016-01-13 19:37:36 -08:00
cciss.h
cryptoloop.c block: cryptoloop - Use new skcipher interface 2016-01-27 20:35:43 +08:00
DAC960.c block: use pci_zalloc_consistent 2014-08-08 15:57:28 -07:00
DAC960.h
floppy.c block, fs, mm, drivers: use bio set/get op accessors 2016-06-07 13:41:38 -06:00
hd.c block: hd: remove deprecated IRQF_DISABLED 2014-10-01 08:16:07 -06:00
Kconfig cpqarray: remove it from the kernel 2016-03-14 09:06:01 -06:00
loop.c block, drivers: add REQ_OP_FLUSH operation 2016-06-07 13:41:38 -06:00
loop.h block: loop: support DIO & AIO 2015-09-23 11:01:16 -06:00
Makefile drivers:block: cpqarray clean up 2016-03-15 15:59:47 -07:00
mg_disk.c mg_disk: fix enum REQ_OP_ kbuild error 2016-06-08 15:01:16 -06:00
nbd.c block, drivers: add REQ_OP_FLUSH operation 2016-06-07 13:41:38 -06:00
null_blk.c null_blk: add lightnvm null_blk device to the nullb_list 2016-03-18 18:10:37 -07:00
osdblk.c block, drivers: add REQ_OP_FLUSH operation 2016-06-07 13:41:38 -06:00
pktcdvd.c block, fs, mm, drivers: use bio set/get op accessors 2016-06-07 13:41:38 -06:00
ps3disk.c block, drivers: add REQ_OP_FLUSH operation 2016-06-07 13:41:38 -06:00
ps3vram.c block: change ->make_request_fn() and users to return a queue cookie 2015-11-07 10:40:46 -07:00
rbd_types.h
rbd.c drivers: use req op accessor 2016-06-07 13:41:38 -06:00
skd_main.c block, drivers: add REQ_OP_FLUSH operation 2016-06-07 13:41:38 -06:00
skd_s1120.h
smart1,2.h
sunvdc.c sunvdc: reconnect ldc after vds service domain restarts 2014-12-11 18:52:45 -08:00
swim3.c powerpc: Move Power Macintosh drivers to generic byteswappers 2015-03-23 14:29:40 +11:00
swim_asm.S
swim.c block: drop owner assignment from platform_drivers 2014-10-20 16:20:18 +02:00
sx8.c sx8: use real time for the command seconds 2015-12-23 08:42:59 -07:00
umem.c block, drivers, cgroup: use op_is_write helper instead of checking for REQ_WRITE 2016-06-07 13:41:38 -06:00
umem.h
virtio_blk.c block, drivers: add REQ_OP_FLUSH operation 2016-06-07 13:41:38 -06:00
xen-blkfront.c block: add a separate operation type for secure erase 2016-06-09 09:52:25 -06:00
xsysace.c block: systemace: Remove .owner field for driver 2014-08-21 20:37:54 -05:00
z2ram.c