GRE assumes that the GRE header is at skb_network_header +
ip_hrdlen(skb). It is more general to use skb_transport_header
and this allows the possbility of inserting additional header
between IP and GRE (which is what we will done in Generic UDP
Encapsulation for GRE).
Signed-off-by: Tom Herbert <therbert@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville says:
====================
pull request: wireless 2014-09-05
Please pull this batch of fixes intended for the 3.17 stream...
For the mac80211 bits, Johannes says:
"Here are a few fixes for mac80211. One has been discussed for a while
and adds a terminating NUL-byte to the alpha2 sent to userspace, which
shouldn't be necessary but since many places treat it as a string we
couldn't move to just sending two bytes.
In addition to that, we have two VLAN fixes from Felix, a mesh fix, a
fix for the recently introduced RX aggregation offload, a revert for
a broken patch (that luckily didn't really cause any harm) and a small
fix for alignment in debugfs."
For the iwlwifi bits, Emmanuel says:
"I revert a patch that disabled CTS to self in dvm because users
reported issues. The revert is CCed to stable since the offending
patch was sent to stable too. I also bump the firmware API versions
since a new firmware is coming up. On top of that, Marcel fixes a
bug I introduced while fixing a bug in our Kconfig file."
Please let me know if there are problems!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
It is possible that the interface is already gone after joining
the list of anycast on this interface as we don't hold a refcount
for the device, in this case we are safe to ignore the error.
What's more important, for API compatibility we should not
change this behavior for applications even if it were correct.
Fixes: commit a9ed4a2986 ("ipv6: fix rtnl locking in setsockopt for anycast and multicast")
Cc: Sabrina Dubroca <sd@queasysnail.net>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of dereference each byte let's use %*ph specifier in the printk()
calls.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The TCP_SKB_CB(skb)->when field no longer exists as of recent change
7faee5c0d5 ("tcp: remove TCP_SKB_CB(skb)->when"). And in any case,
tcp_fragment() is called on already-transmitted packets from the
__tcp_retransmit_skb() call site, so copying timestamps of any kind
in this spot is quite sensible.
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Reported-by: Yuchung Cheng <ycheng@google.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
After commit 740b0f1841 ("tcp: switch rtt estimations to usec resolution"),
we no longer need to maintain timestamps in two different fields.
TCP_SKB_CB(skb)->when can be removed, as same information sits in skb_mstamp.stamp_jiffies
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TCP_SKB_CB(skb)->when has different meaning in output and input paths.
In output path, it contains a timestamp.
In input path, it contains an ISN, chosen by tcp_timewait_state_process()
Lets add a different name to ease code comprehension.
Note that 'when' field will disappear in following patch,
as skb_mstamp already contains timestamp, the anonymous
union will promptly disappear as well.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch updates some of the flow_dissector api so that it can be used to
parse the length of ethernet buffers stored in fragments. Most of the
changes needed were to __skb_get_poff as it needed to be updated to support
sending a linear buffer instead of a skb.
I have split __skb_get_poff into two functions, the first is skb_get_poff
and it retains the functionality of the original __skb_get_poff. The other
function is __skb_get_poff which now works much like __skb_flow_dissect in
relation to skb_flow_dissect in that it provides the same functionality but
works with just a data buffer and hlen instead of needing an skb.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since sock_efree and sock_demux are essentially the same code for non-TCP
sockets and the case where CONFIG_INET is not defined we can combine the
code or replace the call to sock_edemux in several spots. As a result we
can avoid a bit of unnecessary code or code duplication.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The phy timestamping takes a different path than the regular timestamping
does in that it will create a clone first so that the packets needing to be
timestamped can be placed in a queue, or the context block could be used.
In order to support these use cases I am pulling the core of the code out
so it can be used in other drivers beyond just phy devices.
In addition I have added a destructor named sock_efree which is meant to
provide a simple way for dropping the reference to skb exceptions that
aren't part of either the receive or send windows for the socket, and I
have removed some duplication in spots where this destructor could be used
in place of sock_edemux.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change merges the shared bits that exist between skb_tx_tstamp and
skb_complete_tx_timestamp. By doing this we can avoid the two diverging as
there were already changes pushed into skb_tx_tstamp that hadn't made it
into the other function.
In addition this resolves issues with the fact that
skb_complete_tx_timestamp was included in linux/skbuff.h even though it was
only compiled in if phy timestamping was enabled.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lets make this hash function a bit secure, as ICMP attacks are still
in the wild.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fix spelling typo found in DocBook/networking.xml.
It is because the neworking.xml is generated from comments
in the source, I have to fix typo in comments within the source.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Paul Bolle reports that 'select NETFILTER_XT_NAT' from the IPV4 and IPV6
NAT tables becomes noop since there is no Kconfig switch for it. Add the
Kconfig switch to resolve this problem.
Fixes: 8993cf8 netfilter: move NAT Kconfig switches out of the iptables scope
Reported-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
nh_exceptions is effectively used under rcu, but lacks proper
barriers. Between kzalloc() and setting of nh->nh_exceptions(),
we need a proper memory barrier.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Fixes: 4895c771c7 ("ipv4: Add FIB nexthop exceptions.")
Signed-off-by: David S. Miller <davem@davemloft.net>
addrconf_get_prefix_route() ensures to get the right route in the right table.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no reason to take a refcnt before deleting the peer address route.
It's done some lines below for the local prefix route because
inet6_ifa_finish_destroy() will release it at the end.
For the peer address route, we want to free it right now.
This bug has been introduced by commit
caeaba7900 ("ipv6: add support of peer address").
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This syntax error was covered by L2TP_REFCNT_DEBUG not being set by
default.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The timestamping API has separate bits for generating and reporting
timestamps. A software timestamp should only be reported for a packet
when the packet has the relevant generation flag (SKBTX_..) set
and the socket has reporting bit SOF_TIMESTAMPING_SOFTWARE set.
The second check was accidentally removed. Reinstitute the original
behavior.
Tested:
Without this patch, Documentation/networking/txtimestamp reports
timestamps regardless of whether SOF_TIMESTAMPING_SOFTWARE is set.
After the patch, it only reports them when the flag is set.
Fixes: f24b9be595 ("net-timestamp: extend SCM_TIMESTAMPING ancillary data struct")
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds new ethtool cmd, ETHTOOL_GTUNABLE & ETHTOOL_STUNABLE for getting
tunable values from driver.
Add get_tunable and set_tunable to ethtool_ops. Driver implements these
functions for getting/setting tunable value.
Signed-off-by: Govindarajulu Varadarajan <_govind@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcel reported to see the following message when autoloading
is being triggered when adding nlmon device:
Loading kernel module for a network device with
CAP_SYS_MODULE (deprecated). Use CAP_NET_ADMIN and alias
netdev-nlmon instead.
This false-positive happens despite with having correct
capabilities set, e.g. through issuing `ip link del dev nlmon`
more than once on a valid device with name nlmon, but Marcel
has also seen it on creation time when no nlmon module is
previously compiled-in or loaded as module and the device
name equals a link type name (e.g. nlmon, vxlan, team).
Stephen says:
The netdev module alias is a hold over from the past. For
normal devices, people used to create a alias eth0 to and
point it to the type of network device used, that was back
in the bad old ISA days before real discovery.
Also, the tunnels create module alias for the control device
and ip used to use this to autoload the tunnel device.
The message is bogus and should just be removed, I also see
it in a couple of other cases where tap devices are renamed
for other usese.
As mentioned in 8909c9ad8f ("net: don't allow CAP_NET_ADMIN
to load non-netdev kernel modules"), we nevertheless still
might want to leave the old autoloading behaviour in place
as it could break old scripts, so for now, lets just remove
the log message as Stephen suggests.
Reference: http://thread.gmane.org/gmane.linux.kernel/1105168
Reported-by: Marcel Holtmann <marcel@holtmann.org>
Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Cc: Vasiliy Kulikov <segoon@openwall.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With eBPF getting more extended and exposure to user space is on it's way,
hardening the memory range the interpreter uses to steer its command flow
seems appropriate. This patch moves the to be interpreted bytecode to
read-only pages.
In case we execute a corrupted BPF interpreter image for some reason e.g.
caused by an attacker which got past a verifier stage, it would not only
provide arbitrary read/write memory access but arbitrary function calls
as well. After setting up the BPF interpreter image, its contents do not
change until destruction time, thus we can setup the image on immutable
made pages in order to mitigate modifications to that code. The idea
is derived from commit 314beb9bca ("x86: bpf_jit_comp: secure bpf jit
against spraying attacks").
This is possible because bpf_prog is not part of sk_filter anymore.
After setup bpf_prog cannot be altered during its life-time. This prevents
any modifications to the entire bpf_prog structure (incl. function/JIT
image pointer).
Every eBPF program (including classic BPF that are migrated) have to call
bpf_prog_select_runtime() to select either interpreter or a JIT image
as a last setup step, and they all are being freed via bpf_prog_free(),
including non-JIT. Therefore, we can easily integrate this into the
eBPF life-time, plus since we directly allocate a bpf_prog, we have no
performance penalty.
Tested with seccomp and test_bpf testsuite in JIT/non-JIT mode and manual
inspection of kernel_page_tables. Brad Spengler proposed the same idea
via Twitter during development of this patch.
Joint work with Hannes Frederic Sowa.
Suggested-by: Brad Spengler <spender@grsecurity.net>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Kees Cook <keescook@chromium.org>
Acked-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Calling setsockopt with IPV6_JOIN_ANYCAST or IPV6_LEAVE_ANYCAST
triggers the assertion in addrconf_join_solict()/addrconf_leave_solict()
ipv6_sock_ac_join(), ipv6_sock_ac_drop(), ipv6_sock_ac_close() need to
take RTNL before calling ipv6_dev_ac_inc/dec. Same thing with
ipv6_sock_mc_join(), ipv6_sock_mc_drop(), ipv6_sock_mc_close() before
calling ipv6_dev_mc_inc/dec.
This patch moves ASSERT_RTNL() up a level in the call stack.
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Reported-by: Tommi Rantala <tt.rantala@gmail.com>
Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
As in IPv6 people might increase the igmp query robustness variable to
make sure unsolicited state change reports aren't lost on the network. Add
and document this new knob to igmp code.
RFCs allow tuning this parameter back to first IGMP RFC, so we also use
this setting for all counters, including source specific multicast.
Also take over sysctl value when upping the interface and don't reuse
the last one seen on the interface.
Cc: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a new sysctl_mld_qrv knob to configure the mldv1/v2 query
robustness variable. It specifies how many retransmit of unsolicited mld
retransmit should happen. Admins might want to tune this on lossy links.
Also reset mld state on interface down/up, so we pick up new sysctl
settings during interface up event.
IPv6 certification requests this knob to be available.
I didn't make this knob netns specific, as it is mostly a setting in a
physical environment and should be per host.
Cc: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Acked-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
and adds a terminating NUL-byte to the alpha2 sent to userspace, which
shouldn't be necessary but since many places treat it as a string we
couldn't move to just sending two bytes.
In addition to that, we have two VLAN fixes from Felix, a mesh fix, a
fix for the recently introduced RX aggregation offload, a revert for
a broken patch (that luckily didn't really cause any harm) and a small
fix for alignment in debugfs.
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJUAFrmAAoJEDBSmw7B7bqr79AP/1yGi9lkv/wWUs5y0AhUSen9
850MU26BBlyAAFSz11xqgaEeRmeBeqhR3K7w/M02TX0CHxBzMqMZfyE//tq0UJaI
ZwZmtyQmdMiOSNKignTIIx7OHTioq0wrGKb6O2UvKoJfTlB9t01jCC4jmCTF5Vos
6ReF7NaZEbxW6XDOsClNTAtIa1c6n1RQ5VbDIEL5Vfvqv8LbcobduF8WcYl80eIQ
+EvIHtUm/Luxg6DblibgEVtwYOtNpvRz4pofdw3xoSHAnF+zhXbUr0dUjpkBNA7o
vWboCBl14Qn1M7pOJZ0+TBzFmquAr6CDbDvArVCH01Swh27EUDQUcHQAggGpT71w
DFgWHOYP0UCB6Y4U0GjBehy8PeuytqJLBSceKVud7DDqd8fY+Lq3MMyicIk0aw3o
IIDLWrujkCBXsdfuxQETmYxHU05WHSuYOCTgGSqbq3QPTWm8pBGWTdbk+1t/0FyH
cGLJOWs/jCrtHdzDj6TH+kL8NmvwB7sC9MT45qG0ilevmPW25yrnTJPEMEvFBqvZ
lnaqiX6D1kGNZd09CxgSIhxrQi+N0Yg+UlLa4IUtOIqnQussOC3xH2U5qTufdpa1
Gi9aCkBGVKQiObPWucf2QB4t1sZ18rxBhrAelZhQPLTKrnsuLhpcVBlU+L6ScCAk
FVni4HZH2IGtDQ577k10
=G/pB
-----END PGP SIGNATURE-----
Merge tag 'mac80211-for-john-2014-08-29' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211
Johannes Berg <johannes@sipsolutions.net> says:
"Here are a few fixes for mac80211. One has been discussed for a while
and adds a terminating NUL-byte to the alpha2 sent to userspace, which
shouldn't be necessary but since many places treat it as a string we
couldn't move to just sending two bytes.
In addition to that, we have two VLAN fixes from Felix, a mesh fix, a
fix for the recently introduced RX aggregation offload, a revert for
a broken patch (that luckily didn't really cause any harm) and a small
fix for alignment in debugfs."
Signed-off-by: John W. Linville <linville@redhat.com>
distinguish between the dropped and consumed skb, not assume the skb
is consumed always
Cc: Thomas Graf <tgraf@noironetworks.com>
Cc: Pravin Shelar <pshelar@nicira.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit 50cbe9ab5f ("net: Validate xmit SKBs right when we
pull them out of the qdisc") the validation code was moved out of
dev_hard_start_xmit and into dequeue_skb.
However this overlooked the fact that we do not always enqueue
the skb onto a qdisc. First situation is if qdisc have flag
TCQ_F_CAN_BYPASS and qdisc is empty. Second situation is if
there is no qdisc on the device, which is a common case for
software devices.
Originally spotted and inital patch by Alexander Duyck.
As a result Alex was seeing issues trying to connect to a
vhost_net interface after commit 50cbe9ab5f was applied.
Added a call to validate_xmit_skb() in __dev_xmit_skb(), in the
code path for qdiscs with TCQ_F_CAN_BYPASS flag, and in
__dev_queue_xmit() when no qdisc.
Also handle the error situation where dev_hard_start_xmit() could
return a skb list, and does not return dev_xmit_complete(rc) and
falls through to the kfree_skb(), in that situation it should
call kfree_skb_list().
Fixes: 50cbe9ab5f ("net: Validate xmit SKBs right when we pull them out of the qdisc")
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
More minor fixes to merge commit 53fda7f7f9 (Merge branch 'xmit_list')
that allows us to work with a list of SKBs.
Fixing exit cases in qdisc_reset() and qdisc_destroy(), where a
leftover requeued SKB (qdisc->gso_skb) can have the potential of
being a skb list, thus use kfree_skb_list().
This is a followup to commit 10770bc2d1 ("qdisc: adjustments for
API allowing skb list xmits").
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Minor adjustments for merge commit 53fda7f7f9 (Merge branch 'xmit_list')
that allows us to work with a list of SKBs.
Update code doc to function sch_direct_xmit().
In handle_dev_cpu_collision() use kfree_skb_list() in error handling.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The user_skb maybe be leaked if the operation on it failed and codes
skipped into the label "out:" without calling genlmsg_unicast.
Cc: Pravin Shelar <pshelar@nicira.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
make defconfig reports:
warning: (NETFILTER_XT_TARGET_LOG) selects NF_LOG_IPV6 which has unmet direct dependencies (NET && INET && IPV6 && NETFILTER && NETFILTER_ADVANCED)
Fixes: d79a61d netfilter: NETFILTER_XT_TARGET_LOG selects NF_LOG_*
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
pull request: Netfilter/IPVS fixes for net
The following patchset contains seven Netfilter fixes for your net
tree, they are:
1) Make the NAT infrastructure independent of x_tables, some users are
already starting to test nf_tables with NAT without enabling x_tables.
Without this patch for Kconfig, there's a superfluous dependency
between NAT and x_tables.
2) Allow to use 0 in the cgroup match, the kernel rejects with -EINVAL
with no good reason. From Daniel Borkmann.
3) Select CONFIG_NF_NAT from the nf_tables NAT expression, this also
resolves another NAT dependency with x_tables.
4) Use HAVE_JUMP_LABEL instead of CONFIG_JUMP_LABEL in the Netfilter hook
code as elsewhere in the kernel to resolve toolchain problems, from
Zhouyi Zhou.
5) Use iptunnel_handle_offloads() to set up tunnel encapsulation
depending on the offload capabilities, reported by Alex Gartrell
patch from Julian Anastasov.
6) Fix wrong family when registering the ip_vs_local_reply6() hook,
also from Julian.
7) Select the NF_LOG_* symbols from NETFILTER_XT_TARGET_LOG. Rafał
Miłecki reported that when jumping from 3.16 to 3.17-rc, his log
target is not selected anymore due to changes in the previous
development cycle to accomodate the full logging support for
nf_tables.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Depending on which parameters were updated, the changes were not propagated via
the notifier chain and netlink.
The new flag has been set only when the change did not cause a call to the
notifier chain and/or to the netlink notification functions.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There is no functional changes with this commit, it only prepares the next one.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The only effect of this patch is to print a warning if IFLA_LINKMODE is updated
and a following change fails.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The only effect of this patch is to print a warning if IFLA_TXQLEN is updated
and a following change fails.
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sk->sk_error_queue is dequeued in four locations. All share the
exact same logic. Deduplicate.
Also collapse the two critical sections for dequeue (at the top of
the recv handler) and signal (at the bottom).
This moves signal generation for the next packet forward, which should
be harmless.
It also changes the behavior if the recv handler exits early with an
error. Previously, a signal for follow-up packets on the errqueue
would then not be scheduled. The new behavior, to always signal, is
arguably a bug fix.
For rxrpc, the change causes the same function to be called repeatedly
for each queued packet (because the recv handler == sk_error_report).
It is likely that all packets will fail for the same reason (e.g.,
memory exhaustion).
This code runs without sk_lock held, so it is not safe to trust that
sk->sk_err is immutable inbetween releasing q->lock and the subsequent
test. Introduce int err just to avoid this potential race.
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Call skb_checksum_try_convert and skb_gro_checksum_try_convert
after checksum is found present and validated in the GRE header
for normal and GRO paths respectively.
In GRO path, call skb_gro_checksum_try_convert
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for doing CHECKSUM_UNNECESSARY to CHECKSUM_COMPLETE
conversion in UDP tunneling path.
In the normal UDP path, we call skb_checksum_try_convert after locating
the UDP socket. The check is that checksum conversion is enabled for
the socket (new flag in UDP socket) and that checksum field is
non-zero.
In the UDP GRO path, we call skb_gro_checksum_try_convert after
checksum is validated and checksum field is non-zero. Since this is
already in GRO we assume that checksum conversion is always wanted.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This flag indicates that an invalid checksum was detected in the
packet. __skb_mark_checksum_bad helper function was added to set this.
Checksums can be marked bad from a driver or the GRO path (the latter
is implemented in this patch). csum_bad is checked in
__skb_checksum_validate_complete (i.e. calling that when ip_summed ==
CHECKSUM_NONE).
csum_bad works in conjunction with ip_summed value. In the case that
ip_summed is CHECKSUM_NONE and csum_bad is set, this implies that the
first (or next) checksum encountered in the packet is bad. When
ip_summed is CHECKSUM_UNNECESSARY, the first checksum after the last
one validated is bad. For example, if ip_summed == CHECKSUM_UNNECESSARY,
csum_level == 1, and csum_bad is set-- then the third checksum in the
packet is bad. In the normal path, the packet will be dropped when
processing the protocol layer of the bad checksum:
__skb_decr_checksum_unnecessary called twice for the good checksums
changing ip_summed to CHECKSUM_NONE so that
__skb_checksum_validate_complete is called to validate the third
checksum and that will fail since csum_bad is set.
Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/dsa/dsa.c:624:20: sparse: symbol 'dsa_pack_type' was not declared.
Should it be static?
Fixes: 3e8a72d1da ("net: dsa: reduce number of protocol hooks")
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix places where there is space before tab, long lines, and
awkward if(){, double spacing etc. Add blank line after declaration/initialization.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
John W. Linville says:
====================
pull request: wireless 2014-08-28
Please pull this batch of fixes intended for the 3.17 stream.
For the Bluetooth/6LowPAN/802.15.4 bits, Johan says:
'It contains a connection reference counting fix for LE where a
connection might stay up even though it should get disconnected.
The other 802.15.4 6LoWPAN related patches were sent to the bluetooth
tree by Alexander Aring and described as follows by him:
"
these patches contains patches for the bluetooth branch.
This series includes memory leak fixes and an errno value fix.
Also there are two patches for sending and receiving 1280 6LoWPAN
packets, which makes the IEEE 802.15.4 6LoWPAN stack more RFC
compliant.
"'
Along with that...
Alexey Khoroshilov fixes a use-after-free bug on at76c50x-usb.
Hauke Mehrtens adds a PCI ID to bcma.
Himangi Saraogi fixes a silly "A || A" test in rtlwifi.
Larry Finger adds a device ID to rtl8192cu.
Maks Naumov fixes a strncmp argument in ath9k.
Álvaro Fernández Rojas adds a PCI ID to ssb.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Then testing the TX limits of the stack, then it is useful to
be-able to disable the do_gettimeofday() timetamping on every packet.
This implements a pktgen flag NO_TIMESTAMP which will disable this
call to do_gettimeofday().
The performance change on (my system E5-2695) with skb_clone=0, goes
from TX 2,423,751 pps to 2,567,165 pps with flag NO_TIMESTAMP. Thus,
the cost of do_gettimeofday() or saving is approx 23 nanosec.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
TIPC name table updates are distributed asynchronously in a cluster,
entailing a risk of certain race conditions. E.g., if two nodes
simultaneously issue conflicting (overlapping) publications, this may
not be detected until both publications have reached a third node, in
which case one of the publications will be silently dropped on that
node. Hence, we end up with an inconsistent name table.
In most cases this conflict is just a temporary race, e.g., one
node is issuing a publication under the assumption that a previous,
conflicting, publication has already been withdrawn by the other node.
However, because of the (rtt related) distributed update delay, this
may not yet hold true on all nodes. The symptom of this failure is a
syslog message: "tipc: Cannot publish {%u,%u,%u}, overlap error".
In this commit we add a resiliency queue at the receiving end of
the name table distributor. When insertion of an arriving publication
fails, we retain it in this queue for a short amount of time, assuming
that another update will arrive very soon and clear the conflict. If so
happens, we insert the publication, otherwise we drop it.
The (configurable) retention value defaults to 2000 ms. Knowing from
experience that the situation described above is extremely rare, there
is no risk that the queue will accumulate any large number of items.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
We need to perform the same actions when processing deferred name
table updates, so this functionality is moved to a separate
function.
Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>