linux_dsm_epyc7002/include/net
Martin KaFai Lau 33c162a980 ipv6: datagram: Update dst cache of a connected datagram sk during pmtu update
On a connected UDP socket, getsockopt(IPV6_MTU) can return a stale
MTU value. The following sequence reproduces it:
1. Create a connected UDP socket
2. Send some datagrams out
3. Receive an ICMPV6_PKT_TOOBIG
4. Send no new outgoing datagrams, so nothing triggers the
   sk_dst_check() logic that would update sk->sk_dst_cache.
5. getsockopt(IPV6_MTU) returns the mtu from the invalid
   sk->sk_dst_cache instead of the newly created RTF_CACHE clone.

This patch updates sk->sk_dst_cache for a connected datagram sk
in the pmtu-update code path.
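
The stale read and the fix can be illustrated with a toy model. This is a minimal sketch in plain Python, not kernel code: ToyRouteTable, ToySock, Dst and the refresh_sk_cache flag are hypothetical names made up for illustration, and the 1500-byte default MTU is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Dst:
    """Toy route entry; 'obsolete' marks an entry that has been superseded."""
    mtu: int
    obsolete: bool = False


class ToyRouteTable:
    def __init__(self, default_mtu=1500):  # 1500 is an assumed link MTU
        self.default_mtu = default_mtu
        self.routes = {}  # daddr -> current Dst

    def lookup(self, daddr):
        if daddr not in self.routes:
            self.routes[daddr] = Dst(self.default_mtu)
        return self.routes[daddr]

    def pmtu_exception(self, daddr, mtu):
        """Invalidate the old entry and install a clone carrying the new MTU."""
        self.lookup(daddr).obsolete = True
        self.routes[daddr] = Dst(mtu)


class ToySock:
    def __init__(self, table, daddr):
        self.table = table
        self.daddr = daddr
        self.dst_cache = table.lookup(daddr)  # filled when connecting/sending

    def dst_check(self):
        """Stands in for sk_dst_check(): only the send path revalidates."""
        if self.dst_cache.obsolete:
            self.dst_cache = self.table.lookup(self.daddr)
        return self.dst_cache

    def getsockopt_mtu(self):
        """Reads the cached entry without revalidating it."""
        return self.dst_cache.mtu


def pmtu_update(table, sk, mtu, refresh_sk_cache):
    table.pmtu_exception(sk.daddr, mtu)
    if refresh_sk_cache:  # the behaviour this patch adds, in toy form
        sk.dst_cache = table.lookup(sk.daddr)
```

With refresh_sk_cache=False, getsockopt_mtu() keeps returning the old value until dst_check() runs on the next send; with refresh_sk_cache=True it returns the new MTU right away.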

Note that sk->sk_v6_daddr is used for the route lookup instead of
skb->data (i.e. iph).  This is because a UDP socket can become
connected after sending out some datagrams in the unconnected state,
or it can be connected multiple times to different destinations.
Hence, iph may not be related to where sk is currently connected.

The update is done only when '!sock_owned_by_user(sk)' holds, because
the user may call ip6_datagram_connect() again (i.e. change
sk->sk_v6_daddr) while the dst lookup is happening in the pmtu-update
code path.

For the sock_owned_by_user(sk) == true case, the next patch will
introduce a release_cb() which will update the sk->sk_dst_cache.
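
The split between the two cases can be pictured as a defer-until-release pattern. The sketch below is a toy in plain Python, not the kernel implementation: ToySock, refresh_pending, lock() and release() are hypothetical names, and the pending flag is an assumption about how a deferred update could be tracked.

```python
class ToySock:
    """Toy stand-in for a connected datagram socket (not kernel code)."""

    def __init__(self, daddr, lookup):
        self.daddr = daddr            # plays the role of sk->sk_v6_daddr
        self.lookup = lookup          # hypothetical route lookup: daddr -> mtu
        self.dst_mtu = lookup(daddr)  # plays the role of sk->sk_dst_cache
        self.owned_by_user = False    # plays the role of sock_owned_by_user(sk)
        self.refresh_pending = False  # hypothetical "refresh at release" flag

    def pmtu_event(self):
        """Called when a Packet Too Big notification arrives."""
        if self.owned_by_user:
            # daddr may be changing under a concurrent connect(), so defer.
            self.refresh_pending = True
        else:
            self.dst_mtu = self.lookup(self.daddr)

    def lock(self):
        self.owned_by_user = True

    def release(self):
        """Plays the role of the release callback."""
        self.owned_by_user = False
        if self.refresh_pending:
            self.refresh_pending = False
            self.dst_mtu = self.lookup(self.daddr)
```

Because the deferred lookup runs at release time, it uses whatever destination the socket ended up connected to, which is the property the '!sock_owned_by_user(sk)' check is protecting.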

Test:

Server (Connected UDP Socket):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Route Details:
[root@arch-fb-vm1 ~]# ip -6 r show | egrep '2fac'
2fac::/64 dev eth0  proto kernel  metric 256  pref medium
2fac:face::/64 via 2fac::face dev eth0  metric 1024  pref medium

A simple Python script that creates a connected UDP socket:

import socket
import errno

HOST = '2fac::1'
PORT = 8080

s = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
s.bind((HOST, PORT))
s.connect(('2fac:face::face', 53))
print("connected")
while True:
    try:
        data = s.recv(1024)
    except socket.error as se:
        if se.errno == errno.EMSGSIZE:
            # 41 = IPPROTO_IPV6, 24 = IPV6_MTU
            pmtu = s.getsockopt(41, 24)
            print("PMTU:%d" % pmtu)
            break
s.close()
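
The literals 41 and 24 passed to getsockopt() above are IPPROTO_IPV6 and IPV6_MTU. A small helper with named constants might look like the sketch below; path_mtu is a hypothetical name, and the fallback value 24 assumes the Linux definition of IPV6_MTU, since the Python socket module may not export that name.

```python
import socket

# The socket module exports IPPROTO_IPV6 (41).
IPPROTO_IPV6 = socket.IPPROTO_IPV6

# Assumption: Linux defines IPV6_MTU as 24; use the literal if the
# socket module does not provide the name.
IPV6_MTU = getattr(socket, "IPV6_MTU", 24)


def path_mtu(sock):
    """Return the path MTU the kernel reports for a connected IPv6 socket."""
    return sock.getsockopt(IPPROTO_IPV6, IPV6_MTU)
```

In the script above, `pmtu = path_mtu(s)` would replace the call with bare numbers.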

Python program output after getting an ICMPV6_PKT_TOOBIG:
[root@arch-fb-vm1 ~]# python2 ~/devshare/kernel/tasks/fib6/udp-connect-53-8080.py
connected
PMTU:1300

Cache routes after receiving TOOBIG:
[root@arch-fb-vm1 ~]# ip -6 r show table cache
2fac:face::face via 2fac::face dev eth0  metric 0
    cache  expires 463sec mtu 1300 pref medium

Client (Send the ICMPV6_PKT_TOOBIG):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
scapy is used to generate the TOOBIG message.  Here is the scapy script I have
used:

>>> p=Ether(src='da:75:4d:36:ac:32', dst='52:54:00:12:34:66', type=0x86dd)/IPv6(src='2fac::face', dst='2fac::1')/ICMPv6PacketTooBig(mtu=1300)/IPv6(src='2fac::1',dst='2fac:face::face', nh='UDP')/UDP(sport=8080,dport=53)
>>> sendp(p, iface='qemubr0')
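
For readers without scapy, the ICMPv6 layer of that packet can be assembled by hand with the standard library. This is a minimal sketch that only builds the bytes and does not send anything; the helper names are hypothetical, the hop limit of 64 is an assumption, and the quoted UDP header carries a zero checksum for brevity (scapy would normally fill it in).

```python
import socket
import struct

ICMPV6_PKT_TOOBIG = 2  # ICMPv6 type "Packet Too Big" (RFC 4443)
NEXTHDR_UDP = 17
NEXTHDR_ICMPV6 = 58


def addr(text):
    """Pack a textual IPv6 address into its 16-byte form."""
    return socket.inet_pton(socket.AF_INET6, text)


def csum16(data):
    """Folded 16-bit one's-complement sum of data (not yet inverted)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src, dst, length):
    """IPv6 pseudo-header that the ICMPv6 checksum covers."""
    return addr(src) + addr(dst) + struct.pack("!I3xB", length, NEXTHDR_ICMPV6)


def quoted_udp_packet(src, dst, sport, dport):
    """Minimal IPv6 + UDP header pair, as quoted inside the ICMPv6 error."""
    udp = struct.pack("!HHHH", sport, dport, 8, 0)  # checksum left at 0
    ip6 = struct.pack("!IHBB", 6 << 28, len(udp), NEXTHDR_UDP, 64)
    return ip6 + addr(src) + addr(dst) + udp


def pkt_toobig(src, dst, mtu, quoted):
    """ICMPv6 Packet Too Big: type, code, checksum, MTU, quoted packet."""
    body = struct.pack("!I", mtu) + quoted
    unsummed = struct.pack("!BBH", ICMPV6_PKT_TOOBIG, 0, 0) + body
    csum = ~csum16(pseudo_header(src, dst, len(unsummed)) + unsummed) & 0xFFFF
    return struct.pack("!BBH", ICMPV6_PKT_TOOBIG, 0, csum) + body
```

Calling `pkt_toobig('2fac::face', '2fac::1', 1300, quoted_udp_packet('2fac::1', '2fac:face::face', 8080, 53))` yields the ICMPv6 payload that corresponds to the scapy expression above, without the Ethernet and outer IPv6 headers.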

Fixes: 45e4fd2668 ("ipv6: Only create RTF_CACHE routes after encountering pmtu exception")
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Reported-by: Wei Wang <weiwan@google.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Wei Wang <weiwan@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-14 16:29:51 -04:00
9p 9p: switch p9_client_read() to passing struct iov_iter * 2015-04-11 22:28:27 -04:00
bluetooth Bluetooth: Add support for limited privacy mode 2016-03-10 19:51:30 +01:00
caif caif: fix a signedness bug in cfpkt_iterate() 2015-02-20 17:35:14 -05:00
irda irda: Convert function pointer arrays and uses to const 2014-12-10 15:33:16 -05:00
iucv s390/iucv: do not use arrays as argument 2015-09-21 16:03:04 -07:00
netfilter netfilter: nft_masq: support port range 2016-03-02 20:05:27 +01:00
netns ipv6: per netns FIB garbage collection 2016-03-08 15:16:51 -05:00
nfc nfc: netlink: HCI event connectivity implementation 2015-12-29 19:06:20 +01:00
phonet sock: struct proto hash function may error 2016-02-11 03:54:14 -05:00
sctp sctp: avoid refreshing heartbeat timer too often 2016-04-10 22:22:34 -04:00
tc_act net/act_skbedit: Utility functions for mark action 2016-03-10 16:24:02 -05:00
6lowpan.h 6lowpan: iphc: add support for stateful compression 2016-02-23 20:29:40 +01:00
act_api.h net_sched: fix a memory leak in tc action 2016-04-06 16:04:29 -04:00
addrconf.h inet: refactor inet[6]_lookup functions to take skb 2016-02-11 03:54:14 -05:00
af_ieee802154.h ieee802154: af_ieee802154: fix typo in comment. 2015-09-17 13:20:05 +02:00
af_rxrpc.h
af_unix.h unix: correctly track in-flight fds in sending process user_struct 2016-02-08 10:30:42 -05:00
af_vsock.h Revert "Merge branch 'vsock-virtio'" 2015-12-08 21:55:49 -05:00
ah.h ipsec: Remove obsolete MAX_AH_AUTH_LEN 2014-09-18 10:54:36 +02:00
arp.h neigh: Factor out ___neigh_lookup_noref 2015-03-04 00:23:23 -05:00
atmclip.h
ax25.h ax25: Stop using sock->sk_protinfo. 2015-06-28 16:55:44 -07:00
ax88796.h
bond_3ad.h bonding: 3ad: apply ad_actor settings changes immediately 2016-02-09 04:45:49 -05:00
bond_alb.h net: Move bonding headers under include/net 2014-11-10 13:27:49 -05:00
bond_options.h bonding: convert num_grat_arp to the new bonding option API 2015-07-27 01:05:24 -07:00
bonding.h bonding: fix bond_get_stats() 2016-03-18 23:14:15 -04:00
busy_poll.h net: un-inline sk_busy_loop() 2015-11-18 16:17:38 -05:00
cfg80211-wext.h
cfg80211.h cfg80211: basic support for PBSS network type 2016-02-24 09:04:34 +01:00
cfg802154.h nl802154: add support for security layer 2015-09-30 13:16:44 +02:00
checksum.h csum: Update csum_block_add to use rotate instead of byteswap 2016-03-13 15:01:00 -04:00
cipso_ipv4.h cipso: don't use IPCB() to locate the CIPSO IP option 2015-02-11 14:46:37 -05:00
cls_cgroup.h net: wrap sock->sk_cgrp_prioidx and ->sk_classid inside a struct 2015-12-08 22:02:33 -05:00
codel.h net_sched: update hierarchical backlog too 2016-02-29 17:02:33 -05:00
compat.h net: switch importing msghdr from userland to {compat_,}import_iovec() 2015-04-09 00:02:26 -04:00
datalink.h
dcbevent.h
dcbnl.h net/dcb: Add IEEE QCN attribute 2015-03-06 21:50:02 -05:00
devlink.h Introduce devlink infrastructure 2016-03-01 16:07:29 -05:00
dn_dev.h
dn_fib.h
dn_neigh.h netfilter: Pass net into okfn 2015-09-17 17:18:37 -07:00
dn_nsp.h
dn_route.h
dn.h
dsa.h net: dsa: make port_bridge_leave return void 2016-03-14 16:05:31 -04:00
dsfield.h
dst_cache.h net: add dst_cache support 2016-02-16 20:21:48 -05:00
dst_metadata.h ip_tunnel: add support for setting flow label via collect metadata 2016-03-11 15:14:26 -05:00
dst_ops.h ipv4, ipv6: Pass net into __ip_local_out and __ip6_local_out 2015-10-08 04:27:02 -07:00
dst.h bpf, dst: add and use dst_tclassid helper 2016-03-18 19:38:46 -04:00
esp.h
ethoc.h net/ethoc: support big-endian register layout 2015-09-23 15:33:15 -07:00
fib_rules.h net: ipv6: use common fib_default_rule_pref 2015-09-09 14:19:50 -07:00
firewire.h
flow_dissector.h net/flow_dissector: Make dissector_uses_key() and skb_flow_dissector_target() public 2016-03-10 16:24:02 -05:00
flow.h ipv6, trace: fix tos reporting on fib6_table_lookup 2016-03-20 13:44:34 -04:00
flowcache.h
fou.h ip_tunnel: Ops registration for secondary encap (fou, gue) 2014-11-12 15:01:35 -05:00
garp.h
gen_stats.h net: sched: enable per cpu qstats 2014-09-30 01:02:26 -04:00
genetlink.h Revert "genl: Add genlmsg_new_unicast() for unicast message allocation" 2016-02-18 11:42:19 -05:00
geneve.h geneve: Add geneve_get_rx_port support 2015-12-16 10:58:56 -05:00
gre.h gre: Remove support for sharing GRE protocol hook. 2015-08-10 14:03:54 -07:00
gro_cells.h gro_cells: remove spinlock protecting receive queues 2015-08-31 15:17:17 -07:00
gue.h gue: Protocol constants for remote checksum offload 2014-11-05 16:30:03 -05:00
hwbm.h net: add a hardware buffer management helper API 2016-03-14 12:19:46 -04:00
icmp.h
ieee80211_radiotap.h
ieee802154_netdev.h mac802154: constify ieee802154_llsec_ops structure 2016-01-04 20:40:41 +01:00
if_inet6.h ipv6: do retries on stable privacy addresses 2015-03-23 22:12:09 -04:00
ila.h ila: Add generic ILA translation facility 2015-12-15 23:25:20 -05:00
inet6_connection_sock.h ipv6: remove unused in6_addr struct 2016-03-22 15:45:44 -04:00
inet6_hashtables.h inet: refactor inet[6]_lookup functions to take skb 2016-02-11 03:54:14 -05:00
inet_common.h net: avoid NULL deref in inet_ctl_sock_destroy() 2015-11-02 22:46:09 -05:00
inet_connection_sock.h tcp/dccp: fix another race at listener dismantle 2016-02-18 11:35:51 -05:00
inet_ecn.h ipv6: update skb->csum when CE mark is propagated 2016-01-15 15:07:23 -05:00
inet_frag.h ipv4: namespacify ip fragment max dist sysctl knob 2016-02-16 20:42:54 -05:00
inet_hashtables.h soreuseport: fast reuseport TCP socket selection 2016-02-11 03:54:15 -05:00
inet_sock.h net: Allow accepted sockets to be bound to l3mdev domain 2015-12-18 14:43:38 -05:00
inet_timewait_sock.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2015-10-20 06:08:27 -07:00
inetpeer.h inet: tcp: fix inetpeer_set_addr_v4() 2015-12-16 00:14:12 -05:00
ip6_checksum.h ipv6: Pass proto to csum_ipv6_magic as __u8 instead of unsigned short 2016-03-13 23:55:13 -04:00
ip6_fib.h ipv6: Check rt->dst.from for the DST_NOCACHE route 2015-11-15 17:12:37 -05:00
ip6_route.h net: vrf: Fix dst reference counting 2016-04-11 15:56:20 -04:00
ip6_tunnel.h net: replace dst_cache ip6_tunnel implementation with the generic one 2016-02-16 20:21:48 -05:00
ip_fib.h route: check and remove route cache when we get route 2016-02-18 11:31:36 -05:00
ip_tunnels.h tunnels: Remove encapsulation offloads on decap. 2016-03-20 16:33:40 -04:00
ip_vs.h ipvs: drop first packet to redirect conntrack 2016-03-07 11:53:30 +09:00
ip.h net: ipv4: Convert IP network timestamps to be y2038 safe 2016-03-01 17:18:44 -05:00
ipcomp.h
ipconfig.h
ipv6.h ipv6: datagram: Update dst cache of a connected datagram sk during pmtu update 2016-04-14 16:29:51 -04:00
ipx.h switch ipxrtr_route_packet() from iovec to msghdr 2014-11-24 04:28:49 -05:00
iw_handler.h cfg80211/wext: fix message ordering 2016-01-29 17:13:43 +01:00
kcm.h kcm: mark helper functions inline 2016-03-10 14:42:03 -05:00
l3mdev.h net: l3mdev: address selection should only consider devices in L3 domain 2016-02-26 14:22:26 -05:00
lapb.h
lib80211.h lib80211: remove unused print_ssid() 2014-10-14 02:18:27 +02:00
llc_c_ac.h
llc_c_ev.h
llc_c_st.h llc: Make llc_conn_ev_qfyr_t function pointer arrays const 2014-12-10 15:21:24 -05:00
llc_conn.h net: Pass kern from net_proto_family.create to sk_alloc 2015-05-11 10:50:17 -04:00
llc_if.h
llc_pdu.h
llc_s_ac.h
llc_s_ev.h
llc_s_st.h llc: Make llc_sap_action_t function pointer arrays const 2014-12-10 15:21:24 -05:00
llc_sap.h
llc.h
lwtunnel.h lwtunnel: autoload of lwt modules 2016-02-21 22:00:28 -05:00
mac80211.h mac80211: add doc for RX_FLAG_DUP_VALIDATED flag 2016-04-05 11:10:59 +02:00
mac802154.h mac802154: use put and get unaligned functions 2016-03-10 19:51:28 +01:00
mip6.h
mld.h ipv6: mld: answer mldv2 queries with mldv1 reports in mldv1 fallback 2014-09-22 16:23:15 -04:00
mpls_iptunnel.h mpls: multipath route support 2015-10-23 06:26:42 -07:00
mpls.h openvswitch: Add basic MPLS support to kernel 2014-11-05 23:52:33 -08:00
mrp.h
ndisc.h Revert "ipv6: ndisc: inherit metadata dst when creating ndisc requests" 2015-12-01 15:07:59 -05:00
neighbour.h net: add explicit logging and stat for neighbour table overflow 2015-08-10 13:46:21 -07:00
net_namespace.h netfilter: cttimeout: add netns support 2015-12-14 12:48:58 +01:00
net_ratelimit.h
netevent.h
netlabel.h netlabel: fix the netlbl_catmap_setlong() dummy function 2014-08-07 20:55:21 -04:00
netlink.h netlink: add nla_get for le32 and le64 2015-09-30 13:16:44 +02:00
netprio_cgroup.h net: wrap sock->sk_cgrp_prioidx and ->sk_classid inside a struct 2015-12-08 22:02:33 -05:00
netrom.h
nexthop.h
nl802154.h nl802154: add support for security layer 2015-09-30 13:16:44 +02:00
p8022.h
ping.h net: ping: make ping_v6_sendmsg static 2016-03-23 22:09:58 -04:00
pkt_cls.h net/flower: Fix pointer cast 2016-03-11 12:04:37 -05:00
pkt_sched.h net: sched: consolidate tc_classify{,_compat} 2015-08-27 14:18:48 -07:00
protocol.h udp: restrict offloads to one namespace 2016-01-10 17:28:24 -05:00
psnap.h
raw.h sock: struct proto hash function may error 2016-02-11 03:54:14 -05:00
rawv6.h
red.h
regulatory.h cfg80211: allow wiphy specific regdomain management 2014-12-17 11:49:55 +01:00
request_sock.h net: add inet_sk_transparent() helper 2015-12-22 17:03:05 -05:00
rose.h
route.h net: vrf: Fix dst reference counting 2016-04-11 15:56:20 -04:00
rtnetlink.h netlink: Rightsize IFLA_AF_SPEC size calculation 2015-10-21 19:15:20 -07:00
sch_generic.h net: sched: use pfifo_fast for non real queues 2016-03-03 17:38:46 -05:00
scm.h unix: correctly track in-flight fds in sending process user_struct 2016-02-08 10:30:42 -05:00
secure_seq.h inetpeer: get rid of ip_id_count 2014-06-02 11:00:41 -07:00
slhc_vj.h
snmp.h Merge branch 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu 2014-10-15 07:48:18 +02:00
sock_reuseport.h soreuseport: fix NULL ptr dereference SO_REUSEPORT after bind 2016-01-19 14:44:23 -05:00
sock.h sock: struct proto hash function may error 2016-02-11 03:54:14 -05:00
Space.h
stp.h
switchdev.h switchdev: Adding MDB entry offload 2016-01-10 16:50:20 -05:00
tcp_states.h inet: add TCP_NEW_SYN_RECV state 2015-03-12 22:58:12 -04:00
tcp.h Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2016-03-19 10:05:34 -07:00
timewait_sock.h inet: remove BUG_ON() in twsk_destructor() 2015-07-09 15:12:20 -07:00
transp_v6.h
tso.h net: tso: add support for IPv6 2015-10-26 22:24:22 -07:00
udp_tunnel.h ip_tunnel: add support for setting flow label via collect metadata 2016-03-11 15:14:26 -05:00
udp.h sock: struct proto hash function may error 2016-02-11 03:54:14 -05:00
udplite.h net: switch memcpy_fromiovec()/memcpy_fromiovecend() users to copy_from_iter() 2015-02-04 01:34:15 -05:00
vsock_addr.h
vxlan.h vxlan: fix sparse warnings 2016-03-21 13:30:02 -04:00
wext.h
wimax.h net: treewide: Fix typo found in DocBook/networking.xml 2014-09-05 17:35:28 -07:00
x25.h
x25device.h
xfrm.h xfrm: add rcu protection to sk->sk_policy[] 2015-12-11 19:22:06 -05:00