linux_dsm_epyc7002/net/core
Soheil Hassas Yeganeh f5f99309fa sock: do not set sk_err in sock_dequeue_err_skb
Do not set sk_err when dequeuing errors from the error queue.
Doing so results in:
a) Bugs: By overwriting existing sk_err values, it possibly
   hides legitimate errors. It is also incorrect when local
   errors are queued with ip_local_error. That happens in the
   context of a system call, which already returns the error
   code.
b) Inconsistent behavior: When there are pending errors on
   the error queue, sk_err is sometimes 0 (e.g., for
   the first timestamp on the error queue) and sometimes
   set to an error code (after dequeuing the first
   timestamp).
c) Suboptimality: Setting sk_err to ENOMSG on simple
   TX timestamps can abort parallel reads and writes.

Removing this line doesn't break userspace. This is because
userspace code cannot rely on sk_err for detecting whether
there is something on the error queue. Except for ICMP messages
received for UDP and RAW, sk_err is not set at enqueue time,
and as a result sk_err can be 0 while there are plenty of
errors on the error queue.

For ICMP packets in UDP and RAW, sk_err is set when they are
enqueued on the error queue, but that does not result in aborting
reads and writes. For such cases, sk_err is only readable via
getsockopt(SO_ERROR) which will reset the value of sk_err on
its own. More importantly, prior to this patch,
recvmsg(MSG_ERRQUEUE) has a race on setting sk_err (i.e.,
sk_err is set by sock_dequeue_err_skb without atomic ops or
locks) which can store 0 in sk_err even when we have ICMP
messages pending. Removing this line from sock_dequeue_err_skb
eliminates that race.

Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-07 20:29:10 -05:00
..
datagram.c udp: do fwd memory scheduling on dequeue 2016-11-07 13:24:41 -05:00
dev_addr_lists.c net: fix spelling for synchronized 2014-11-18 15:26:32 -05:00
dev_ioctl.c dev_ioctl: use sizeof(x) instead of sizeof x 2014-11-18 15:27:32 -05:00
dev.c net/qdisc: IFF_NO_QUEUE drivers should use consistent TX queue len 2016-11-07 20:15:55 -05:00
devlink.c genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
drop_monitor.c genetlink: mark families as __ro_after_init 2016-10-27 16:16:09 -04:00
dst_cache.c net: dst_cache_per_cpu_dst_set() can be static 2016-03-18 17:45:08 -04:00
dst.c net: add dst_cache to ovs vxlan lwtunnel 2016-02-16 20:21:48 -05:00
ethtool.c sctp: Add GSO support 2016-06-03 19:37:21 -04:00
fib_rules.c net: core: add UID to flows, rules, and routes 2016-11-04 14:45:23 -04:00
filter.c bpf: add helper for retrieving current numa node id 2016-10-22 17:05:52 -04:00
flow_dissector.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-10-30 12:42:58 -04:00
flow.c flowcache: Avoid OOM condition under preasure 2016-03-17 10:28:42 +01:00
gen_estimator.c net: sched: do not acquire qdisc spinlock in qdisc/class stats dump 2016-06-07 16:37:14 -07:00
gen_stats.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-06-10 11:52:24 -07:00
hwbm.c net: hwbm: Fix unbalanced spinlock in error case 2016-05-25 12:35:09 -07:00
link_watch.c dev: introduce dev_get_iflink() 2015-04-02 14:04:59 -04:00
lwtunnel.c lwtunnel: Add destroy state operation 2016-10-15 17:33:41 -04:00
Makefile net: add a hardware buffer management helper API 2016-03-14 12:19:46 -04:00
neighbour.c neigh: allow admin to set NUD_STALE 2016-08-08 15:36:38 -07:00
net_namespace.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2016-10-30 12:42:58 -04:00
net-procfs.c net: remove NETDEV_TX_LOCKED support 2016-04-26 15:53:05 -04:00
net-sysfs.c net: Add support for XPS with QoS via traffic classes 2016-10-31 15:00:48 -04:00
net-sysfs.h net: netdev_kobject_init: annotate with __init 2014-01-05 20:27:54 -05:00
net-traces.c net: IPv6 fib lookup tracepoint 2015-11-22 11:54:10 -05:00
netclassid_cgroup.c core: remove unneded headers for net cgroup controllers. 2016-02-17 15:31:27 -05:00
netevent.c netevent: remove automatic variable in register_netevent_notifier() 2015-05-31 00:03:21 -07:00
netpoll.c net: tracepoint napi:napi_poll add work and budget 2016-07-09 18:05:02 -04:00
netprio_cgroup.c core: remove unneded headers for net cgroup controllers. 2016-02-17 15:31:27 -05:00
pktgen.c net: pktgen: remove rcu locking in pktgen_change_name() 2016-10-17 10:52:59 -04:00
ptp_classifier.c ptp: Change ptp_class to a proper bitmask 2015-11-03 11:08:22 -05:00
request_sock.c tcp: restore fastopen operations 2015-10-05 03:19:06 -07:00
rtnetlink.c net: rtnl: info leak in rtnl_fill_vfinfo() 2016-10-13 12:12:04 -04:00
scm.c unix: correctly track in-flight fds in sending process user_struct 2016-02-08 10:30:42 -05:00
secure_seq.c net: remove a sparse error in secure_dccpv6_sequence_number() 2015-05-25 22:55:37 -04:00
skbuff.c sock: do not set sk_err in sock_dequeue_err_skb 2016-11-07 20:29:10 -05:00
sock_diag.c sock_diag: align nlattr properly when needed 2016-04-26 12:00:48 -04:00
sock_reuseport.c soreuseport: do not export reuseport_add_sock() 2016-10-18 14:18:23 -04:00
sock.c net: core: Add a UID field to struct sock. 2016-11-04 14:45:22 -04:00
stream.c net: do not export sk_stream_write_space 2016-09-28 20:32:38 -04:00
sysctl_net_core.c bpf: add generic constant blinding for use in jits 2016-05-16 13:49:32 -04:00
timestamping.c net: skb_defer_rx_timestamp should check for phydev before setting up classify 2015-07-09 14:17:15 -07:00
tso.c net: tso: add support for IPv6 2015-10-26 22:24:22 -07:00
utils.c net: the space is required before the open parenthesis '(' 2016-06-29 05:15:14 -04:00