Introduce TP_STATUS_CSUM_VALID tp_status flag to tell the
af_packet user that at least the transport header checksum
has been already validated.
For now, the flag may be set for incoming packets only.
Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is just an optimization. We don't need the value of status variable
if the packet is filtered.
Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter's static checker warned that in vxlan_stop we are checking
if 'vs' can be NULL while later we simply derreference it.
As after commit 56ef9c909b ("vxlan: Move socket initialization to
within rtnl scope") 'vs' just cannot be NULL in vxlan_stop() anymore, as
the interface won't go up if the socket initialization fails. So we are
good to just remove the check and make it consistent.
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When I updated the code to address a possible null pointer dereference in
resize I ended up reverting an exception handling fix for the suffix length
in the event that inflate or halve failed. This change is meant to correct
that by reverting the earlier fix and instead simply getting the parent
again after inflate has been completed to avoid the possible null pointer
issue.
Fixes: ddb4b9a13 ("fib_trie: Address possible NULL pointer dereference in resize")
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Always use software checksumming, since the hardware does not have any
checksum offload support.
This significantly improves local TCP tx performance.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The start-of-frame and end-of-frame bits were accidentally swapped.
In the current code it does not make any difference, since they are
always used together.
Signed-off-by: Felix Fietkau <nbd@openwrt.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We implement the SO_SNDLOWAT etc not to be settable and return
ENOPROTOOPT per 1003.1g 7. Move the comment to appropriate
position and update the reference.
Signed-off-by: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet says:
====================
tcp listener refactoring part 15
I am trying to make the final patch pushing request socks into ehash
as small as possible. In this patch series, I made various adjustments
for the SYNACK generation, allowing me to reach 1 Mpps SYNACK in my
stress test (still hitting LISTENER spinlock of course, and the syn_wait
spinlock)
I also converted the ICMP handlers a bit ahead of time :
They no longer need to get the LISTENER socket, and can use
only a lookup in ehash table. No big deal if we ignore ICMP
for requests socks before the final steps.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
dccp_v6_err() can restrict lookups to ehash table, and not to listeners.
Note this patch creates the infrastructure, but this means that ICMP
messages for request sockets are ignored until complete conversion.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
dccp_v4_err() can restrict lookups to ehash table, and not to listeners.
Note this patch creates the infrastructure, but this means that ICMP
messages for request sockets are ignored until complete conversion.
New dccp_req_err() helper is exported so that we can use it in IPv6
in following patch.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_v6_err() can restrict lookups to ehash table, and not to listeners.
Note this patch creates the infrastructure, but this means that ICMP
messages for request sockets are ignored until complete conversion.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
tcp_v4_err() can restrict lookups to ehash table, and not to listeners.
Note this patch creates the infrastructure, but this means that ICMP
messages for request sockets are ignored until complete conversion.
New tcp_req_err() helper is exported so that we can use it in IPv6
in following patch.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is a low hanging fruit, as we'll get rid of syn_wait_lock eventually.
We hold syn_wait_lock for such small sections, that it makes no sense to use
a read/write lock. A spin lock is simply faster.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
listener can be source of false sharing. request sock has some
useful information like : ireq->ir_iif, ireq->ir_num, ireq->ireq_net
This patch does not solve the major problem of having to read
sk->sk_protocol which is sharing a cache line with sk->sk_wmem_alloc.
(This same field is read later in ip_build_and_send_pkt())
One idea would be to move sk_protocol close to sk_family
(using 8 bits instead of 16 for sk_family seems enough)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is not needed, and req->sk_listener points to the listener anyway.
request_sock argument can be const.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cache listen_sock_qlen() to limit false sharing, and read
rskq_defer_accept once as it might change under us.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tilman Schmidt says:
====================
isdn/gigaset: restructure modem response parser
This series of patches restructures the Gigaset ISDN driver's
modem response parser to improve code readability and conform
better to the device's specification and actual behaviour.
Could you please merge these through net-next?
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Restructure the control structure of the modem response parser
to improve readability and error handling.
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Separate literal string handling from main parser loop.
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
Factor out queueing of modem response events into helper function
add_cid_event().
Signed-off-by: Tilman Schmidt <tilman@imap.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
make it same as the netdev_switch_port_bridge_setlink/dellink
api (ie traverse lowerdevs to get to the switch port).
removes "WARN_ON(!ops->ndo_switch_parent_id_get)" because
direct bridge ports can be stacked netdevices (like bonds
and team of switch ports) which may not implement this ndo.
v2 to v3:
- remove changes to bond and team. Bring back the
transparently following lowerdevs like i initially
had for setlink/getlink
(http://www.spinics.net/lists/netdev/msg313436.html)
dave and scott feldman also seem to prefer it be that
way and move to non-transparent way of doing things
if we see a problem down the lane.
v3 to v4:
- fix ret initialization
v4 to v5:
- return err on first failure (scott feldman)
v5 to v6:
- change variable name (err) and initialize to
-EOPNOTSUPP (scott feldman).
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Acked-by: Scott Feldman <sfeldma@gmail.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
skb->priority can be set for two purposes:
1) With respect to IP TOS field, which is computed by a mask.
Ususally used for priority qdisc's (pfifo, prio etc.), on TX
side (we only have ingress qdisc on RX side).
2) Used as a classid or flowid, works in the same way with tc
classid. What's more, this can even override the classid
of tc filters.
For case 1), it has been respected within its netns, I don't
see any point of keeping it for another netns, especially
when packets will be forwarded to Rx path (no matter from TX
path or RX path).
For case 2) we care, our applications run inside a netns,
and we classify the packets by our own filters outside,
If some application sets this priority, it could bypass
our filters, therefore clear it when moving out of a netns,
it makes no sense to bypass tc filters out of its netns.
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tadeusz Struk says:
====================
Add support for async socket operations
After the iocb parameter has been removed from sendmsg() and recvmsg() ops
the socket layer, and the network stack no longer support async operations.
This patch set adds support for asynchronous operations on sockets back.
Changes in v3:
* As sugested by Al Viro instead of adding new functions aio_sendmsg
and aio_recvmsg, added a ptr to iocb into the kernel-side msghdr structure.
This way no change to aio.c is required.
Changes in v2:
* removed redundant total_size param from aio_sendmsg and aio_recvmsg functions
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The way the algif_skcipher works currently is that on sendmsg/sendpage it
builds an sgl for the input data and then on read/recvmsg it sends the job
for encryption putting the user to sleep till the data is processed.
This way it can only handle one job at a given time.
This patch changes it to be asynchronous by adding AIO support.
Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of computing the offset from trailer, this patch computes
netlink_compare_arg_len from the offset of portid and then adds 4
to it. This allows trailer to be removed.
Reported-by: David Miller <davem@davemloft.net>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Murali Karicheri says:
====================
NetCP: Add support for version 1.5
NetCP 1.5 is used in newer K2 SoCs from Texas Instruments
such as K2E, K2L etc. This patch series add support for Ethss
driver for this version of NetCP. While at it, fix couple of
bugs in the original driver.
One of the earlier patch "net: netcp: select davinci_mdio driver
by default" is folded onto this series.
Please review and let me know your comments.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
NetCP 1.5 available on newer K2 SoCs such as K2E and K2L introduced 3
variants of the ethss subsystem, 9 port, 5 port and 2 port. These have
one host port towards the CPU and N external slave ports.
To customize the driver for these new ethss sub systems, multiple
compatibility strings are introduced. Currently some of parameters that
are different on different variants such as number of ALE ports, stats
modules and number of ports are defined through constants. These are now
changed to variables in gbe_priv data that get set based on the
compatibility string. This is required as there are no hardware
identification registers available to distinguish among the variants
of NetCP 1.5 ethss. However there is identification register available
to differentiate between NetCP 1.4 vs NetCP 1.5 and the same is made use
of in the code to differentiate them.
For more reading on the details of this peripheral, please refer to the
User Guide available at http://www.ti.com/lit/pdf/spruhz3
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: WingMan Kwok <w-kwok2@ti.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Mugunthan V N <mugunthanvnm@ti.com>
CC: "Lad, Prabhakar" <prabhakar.csengg@gmail.com>
CC: Grygorii Strashko <grygorii.strashko@ti.com>
CC: Christoph Jaeger <cj@linux.com>
CC: Lokesh Vutla <lokeshvutla@ti.com>
CC: Markus Pargmann <mpa@pengutronix.de>
CC: Kumar Gala <galak@codeaurora.org>
CC: Ian Campbell <ijc+devicetree@hellion.org.uk>
CC: Mark Rutland <mark.rutland@arm.com>
CC: Pawel Moll <pawel.moll@arm.com>
CC: Rob Herring <robh+dt@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix following checkpatch error. It seems to have passed checkpatch
last time when original code was introduced.
ERROR: Macros with complex values should be enclosed in parentheses
#172: FILE: drivers/net/ethernet/ti/netcp_ethss.c:869:
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: WingMan Kwok <w-kwok2@ti.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Mugunthan V N <mugunthanvnm@ti.com>
CC: "Lad, Prabhakar" <prabhakar.csengg@gmail.com>
CC: Grygorii Strashko <grygorii.strashko@ti.com>
CC: Christoph Jaeger <cj@linux.com>
CC: Lokesh Vutla <lokeshvutla@ti.com>
CC: Markus Pargmann <mpa@pengutronix.de>
CC: Kumar Gala <galak@codeaurora.org>
CC: Ian Campbell <ijc+devicetree@hellion.org.uk>
CC: Mark Rutland <mark.rutland@arm.com>
CC: Pawel Moll <pawel.moll@arm.com>
CC: Rob Herring <robh+dt@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Keystone netcp driver re-uses davinci mdio driver. So enable it
by default for keystone netcp driver.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: WingMan Kwok <w-kwok2@ti.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Mugunthan V N <mugunthanvnm@ti.com>
CC: "Lad, Prabhakar" <prabhakar.csengg@gmail.com>
CC: Grygorii Strashko <grygorii.strashko@ti.com>
CC: Christoph Jaeger <cj@linux.com>
CC: Lokesh Vutla <lokeshvutla@ti.com>
CC: Markus Pargmann <mpa@pengutronix.de>
CC: Kumar Gala <galak@codeaurora.org>
CC: Ian Campbell <ijc+devicetree@hellion.org.uk>
CC: Mark Rutland <mark.rutland@arm.com>
CC: Pawel Moll <pawel.moll@arm.com>
CC: Rob Herring <robh+dt@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ethss has multiple modules within the sub system
- switch sub system
- sgmii
- mdio
- switch module
NetCP driver re-uses existing davinci mdio driver. It requires to
have its own register region to map the reg space. So restructure
the code to use separate reg region for the individual modules it
manages. Use range property to define register space of NetCP and
use reg property to define individual reg spaces. So MDIO will have
its own reg space to map. This is a pre-requisite to enable MDIO
driver for NetCP.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: WingMan Kwok <w-kwok2@ti.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Mugunthan V N <mugunthanvnm@ti.com>
CC: "Lad, Prabhakar" <prabhakar.csengg@gmail.com>
CC: Grygorii Strashko <grygorii.strashko@ti.com>
CC: Christoph Jaeger <cj@linux.com>
CC: Lokesh Vutla <lokeshvutla@ti.com>
CC: Markus Pargmann <mpa@pengutronix.de>
CC: Kumar Gala <galak@codeaurora.org>
CC: Ian Campbell <ijc+devicetree@hellion.org.uk>
CC: Mark Rutland <mark.rutland@arm.com>
CC: Pawel Moll <pawel.moll@arm.com>
CC: Rob Herring <robh+dt@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
10G switch requires forward port number in the taginfo field,
where as it should be in packet_info field for necp 1.4 Ethss. So
fill this value correctly in the knav dma descriptor.
Also rename dma_psflags field in struct netcp_tx_pipe to switch_to_port
as it contain no flag, but the switch port number for forwarding the
packet. Add a flag to hold the new flag, SWITCH_TO_PORT_IN_TAGINFO which
will be set for 10G. This can also used in the future for other flags for
the tx_pipe.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: WingMan Kwok <w-kwok2@ti.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Mugunthan V N <mugunthanvnm@ti.com>
CC: "Lad, Prabhakar" <prabhakar.csengg@gmail.com>
CC: Grygorii Strashko <grygorii.strashko@ti.com>
CC: Christoph Jaeger <cj@linux.com>
CC: Lokesh Vutla <lokeshvutla@ti.com>
CC: Markus Pargmann <mpa@pengutronix.de>
CC: Kumar Gala <galak@codeaurora.org>
CC: Ian Campbell <ijc+devicetree@hellion.org.uk>
CC: Mark Rutland <mark.rutland@arm.com>
CC: Pawel Moll <pawel.moll@arm.com>
CC: Rob Herring <robh+dt@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We send unicast neighbor (ARP or NDP) solicitations ucast_probes
times in PROBE state. Zhu Yanjun reported that some implementation
does not reply against them and the entry will become FAILED, which
is undesirable.
We had been dealt with such nodes by sending multicast probes mcast_
solicit times after unicast probes in PROBE state. In 2003, I made
a change not to send them to improve compatibility with IPv6 NDP.
Let's introduce per-protocol per-interface sysctl knob "mcast_
reprobe" to configure the number of multicast (re)solicitation for
reconfirmation in PROBE state. The default is 0, since we have
been doing so for 10+ years.
Reported-by: Zhu Yanjun <Yanjun.Zhu@windriver.com>
CC: Ulf Samuelsson <ulf.samuelsson@ericsson.com>
Signed-off-by: YOSHIFUJI Hideaki <hideaki.yoshifuji@miraclelinux.com>
Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Home routers based on ARM SoCs like BCM4708 also have bcma bus with core
supported by bgmac.
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
On ARM SoCs with bgmac Ethernet hardware we don't have any normal PHY.
There is always a switch attached but it's not even controlled over MDIO
like in case of MIPS devices.
We need a fixed PHY to be able to send/receive packets from the switch.
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
====================
Intel Wired LAN Driver Updates 2015-03-20
This series contains updates to ixgb, e1000e, igb and igbvf.
Eliezer and Todd provide patches to fix a potential issue found during code
inspection. When bringing down an interface netif_carrier_off() should
be one of the first things we do, since this will prevent the stack from
queueing more packets to this interface.
Yanir provides a fix for e1000e that was found in validating i219,
where the call to e1000e_write_protect_nvm_ich8lan() is no longer
supported in newer hardware. Access to these registers causes a
system freeze in early steppings and is ignored in later steppings.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Change bd76a11670 made all DSA drivers
depend on NET_DSA rather than selecting them. However, as the only way
to select this option was to actually select a driver, it made DSA
impossible to enable at all.
This patch adds an explicit entry which the user will have to enable
prior selecting a driver.
Signed-off-by: Mathieu Olivari <mathieu@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Suggested-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit ca10b9e9a8.
No longer needed after commit eb8895debe
("tcp: tcp_make_synack() should use sock_wmalloc")
When under SYNFLOOD, we build lot of SYNACK and hit false sharing
because of multiple modifications done on sk_listener->sk_wmem_alloc
Since tcp_make_synack() uses sock_wmalloc(), there is no need
to call skb_set_owner_w() again, as this adds two atomic operations.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use netif_carrier_off() first, since that will prevent the stack from
queuing more packets to this IF. This operation is fast, and should
behave much nicer when trying to bring down an interface under load.
Reported-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Use netif_carrier_off() first, since that will prevent the stack from
queuing more packets to this IF. This operation is fast, and should
behave much nicer when trying to bring down an interface under load.
Reported-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The call to e1000e_write_protect_nvm_ich8lan() is no longer supported by HW.
Access to these registers causes a system freeze in A step hardware and is
ignored in B step hardware. This function must not be called in hardware
newer than LPT.
Signed-off-by: Yanir Lubetkin <yanirx.lubetkin@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When bringing down an interface netif_carrier_off() should be
one the first things we do, since this will prevent the stack
from queuing more packets to this interface.
This operation is very fast, and should make the device behave
much nicer when trying to bring down an interface under load.
Also, this would Do The Right Thing (TM) if this device has some
sort of fail-over teaming and redirect traffic to the other IF.
Move netif_carrier_off as early as possible.
Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When bringing down an interface netif_carrier_off() should be
one the first things we do, since this will prevent the stack
from queuing more packets to this interface.
This operation is very fast, and should make the device behave
much nicer when trying to bring down an interface under load.
Also, this would Do The Right Thing (TM) if this device has some
sort of fail-over teaming and redirect traffic to the other IF.
Move netif_carrier_off as early as possible.
Signed-off-by: Eliezer Tamir <eliezer.tamir@linux.intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Daniel Borkmann says:
====================
This set adds native eBPF support also to act_bpf and thus covers tc
with eBPF in the classifier *and* action part.
A link to iproute2 preview has been provided in patch 2 and the code
will be pushed out after Stephen has processed the classifier part
and helper bits for tc.
This set depends on ced585c83b ("act_bpf: allow non-default TC_ACT
opcodes as BPF exec outcome"), so a net into net-next merge would be
required first. Hope that's fine by you, Dave. ;)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>