linux_dsm_epyc7002/net
Ilya Dryomov 930c532869 libceph: apply new_state before new_up_client on incrementals
Currently, osd_weight and osd_state fields are updated in the encoding
order.  This is wrong, because an incremental map may look like e.g.

    new_up_client: { osd=6, addr=... } # set osd_state and addr
    new_state: { osd=6, xorstate=EXISTS } # clear osd_state

Suppose osd6's current osd_state is EXISTS (i.e. osd6 is down).  After
applying new_up_client, osd_state is changed to EXISTS | UP.  Carrying
on with the new_state update, we flip EXISTS and leave osd6 in a weird
"!EXISTS but UP" state.  A non-existent OSD is considered down by the
mapping code

2087    for (i = 0; i < pg->pg_temp.len; i++) {
2088            if (ceph_osd_is_down(osdmap, pg->pg_temp.osds[i])) {
2089                    if (ceph_can_shift_osds(pi))
2090                            continue;
2091
2092                    temp->osds[temp->size++] = CRUSH_ITEM_NONE;

and so requests get directed to the second OSD in the set instead of
the first, resulting in OSD-side errors like:

[WRN] : client.4239 192.168.122.21:0/2444980242 misdirected client.4239.1:2827 pg 2.5df899f2 to osd.4 not [1,4,6] in e680/680

and hung rbds on the client:

[  493.566367] rbd: rbd0: write 400000 at 11cc00000 (0)
[  493.566805] rbd: rbd0:   result -6 xferred 400000
[  493.567011] blk_update_request: I/O error, dev rbd0, sector 9330688

The fix is to decouple application from the decoding and:
- apply new_weight first
- apply new_state before new_up_client
- twiddle osd_state flags if marking in
- clear out some of the state if osd is destroyed

Fixes: http://tracker.ceph.com/issues/14901

Cc: stable@vger.kernel.org # 3.15+: 6dd74e44dc: libceph: set 'exists' flag for newly up osd
Cc: stable@vger.kernel.org # 3.15+
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
2016-07-22 15:17:40 +02:00
..
6lowpan 6lowpan: move mac802154 header 2016-04-13 10:41:10 +02:00
9p remove lots of IS_ERR_VALUE abuses 2016-05-27 15:26:11 -07:00
802
8021q vlan: Propagate MAC address to VLANs 2016-05-31 11:56:48 -07:00
appletalk appletalk: fix erroneous return value 2016-02-18 14:59:34 -05:00
atm net/atm: sk_err_soft must be positive 2016-05-23 13:51:10 -07:00
ax25 AX.25: Close socket connection on session completion 2016-06-18 20:55:34 -07:00
batman-adv batman-adv: Clean up untagged vlan when destroying via rtnl-link 2016-06-29 04:01:48 -04:00
bluetooth Bluetooth: fix power_on vs close race 2016-05-13 16:50:23 +02:00
bridge ipv4: Fix ip_skb_dst_mtu to use the sk passed by ip_finish_output 2016-06-30 09:02:48 -04:00
caif net: caif: fix misleading indentation 2016-03-14 13:09:50 -04:00
can sock: enable timestamping using control messages 2016-04-04 15:50:30 -04:00
ceph libceph: apply new_state before new_up_client on incrementals 2016-07-22 15:17:40 +02:00
core net_sched: fix mirrored packets checksum 2016-07-01 16:19:34 -04:00
dcb net/dcb: make dcbnl.c explicitly non-modular 2015-10-09 07:52:27 -07:00
dccp dccp: do not assume DCCP code is non preemptible 2016-05-02 17:02:25 -04:00
decnet net: fix decnet rtnexthop parsing 2016-07-05 14:08:47 -07:00
dns_resolver KEYS: Add a facility to restrict new links into a keyring 2016-04-11 22:37:37 +01:00
dsa dsa: Rename switch chip data to cd 2016-05-11 19:36:28 -04:00
ethernet eth: Pull header from first fragment via eth_get_headlen 2016-02-24 13:58:05 -05:00
hsr net/hsr: Use setup_timer and mod_timer. 2016-05-16 14:00:43 -04:00
ieee802154 ieee802154: fix logic error in ieee802154_llsec_parse_dev_addr 2016-05-29 22:36:25 -07:00
ipv4 ipv4: Fix ip_skb_dst_mtu to use the sk passed by ip_finish_output 2016-06-30 09:02:48 -04:00
ipv6 ipv6: Fix mem leak in rt6i_pcpu 2016-07-05 14:09:23 -07:00
ipx
irda TTY and Serial driver update for 4.7-rc1 2016-05-20 20:57:27 -07:00
iucv af_iucv: Validate socket address length in iucv_sock_bind() 2016-01-19 14:21:08 -05:00
kcm kcm: fix /proc memory leak 2016-06-22 16:32:23 -04:00
key af_key: fix two typos 2015-10-23 03:05:19 -07:00
l2tp l2tp: fix configuration passed to setup_udp_tunnel_sock() 2016-06-08 11:11:53 -07:00
l3mdev net: l3mdev: Allow send on enslaved interface 2016-05-09 22:33:52 -04:00
lapb net/lapb: tuse %*ph to dump buffers 2016-05-29 22:33:25 -07:00
llc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-05-09 15:59:24 -04:00
mac80211 mac80211: Fix mesh estab_plinks counting in STA removal case 2016-06-28 12:39:50 +02:00
mac802154 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2016-03-19 10:05:34 -07:00
mpls gso: Remove arbitrary checks for unsupported GSO 2016-05-20 18:03:15 -04:00
netfilter netfilter: nf_tables: fix a wrong check to skip the inactive rules 2016-06-15 12:17:24 +02:00
netlabel netlabel: fix a problem with netlbl_secattr_catmap_setrng() 2016-04-05 16:10:47 -04:00
netlink netlink: Fix dump skb leak/double free 2016-05-16 22:05:15 -04:00
netrom
nfc nfc: nci: Add nci_nfcc_loopback to the nci core 2016-05-04 01:48:16 +02:00
openvswitch openvswitch: fix conntrack netlink event delivery 2016-06-29 08:13:59 -04:00
packet packet: Use symmetric hash for PACKET_FANOUT_HASH. 2016-07-01 16:07:50 -04:00
phonet sock: struct proto hash function may error 2016-02-11 03:54:14 -05:00
qrtr Merge tag 'qcom-soc-for-4.7-2' into net-next 2016-05-17 14:11:19 -04:00
rds RDS: fix rds_tcp_init() error path 2016-07-04 16:09:49 -07:00
rfkill rfkill: Use switch to demux userspace operations 2016-04-05 10:48:53 +02:00
rose
rxrpc rxrpc: fix ptr_ret.cocci warnings 2016-06-07 15:30:21 -07:00
sched net_sched: fix mirrored packets checksum 2016-07-01 16:19:34 -04:00
sctp net: diag: add missing declarations 2016-06-10 23:22:55 -07:00
sunrpc rpc: share one xps between all backchannels 2016-06-15 10:32:25 -04:00
switchdev switchdev: pass pointer to fib_info instead of copy 2016-05-17 13:58:49 -04:00
tipc tipc: fix nl compat regression for link statistics 2016-07-01 16:47:38 -04:00
unix Merge branch 'overlayfs-af_unix-fix' into overlayfs-linus 2016-06-12 12:05:21 +02:00
vmw_vsock vsock: make listener child lock ordering explicit 2016-06-27 10:44:46 -04:00
wimax
wireless cfg80211: fix proto in ieee80211_data_to_8023 for frames without LLC header 2016-06-29 11:50:33 +02:00
x25 net: fix a kernel infoleak in x25 module 2016-05-09 22:45:33 -04:00
xfrm Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-05-09 15:59:24 -04:00
compat.c packet: compat support for sock_fprog 2016-06-09 23:41:03 -07:00
Kconfig bpf: add generic constant blinding for use in jits 2016-05-16 13:49:32 -04:00
Makefile net: Add Qualcomm IPC router 2016-05-08 23:46:14 -04:00
socket.c fs: poll/select/recvmmsg: use timespec64 for timeout events 2016-05-19 19:12:14 -07:00
sysctl_net.c net: sysctl: fix a kmemleak warning 2015-10-23 06:22:08 -07:00