- Rearranged descriptor writes
- Moved increment command write to xgene_enet_setup_tx_desc
Signed-off-by: Iyappan Subramanian <isubramanian@apm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Stringer says:
====================
OVS conntrack support
The goal of this series is to allow OVS to send packets through the Linux
kernel connection tracker, and subsequently match on fields populated by
conntrack. This functionality is enabled through a new
CONFIG_OPENVSWITCH_CONNTRACK option.
This version addresses the feedback from v5, primarily checking the behaviour
is correct with different configurations such as disabling
CONFIG_OPENVSWITCH_CONNTRACK or disabling individual conntrack features like
connlabels.
The branch below has been updated with the corresponding userspace pieces:
https://github.com/joestringer/ovs dev/ct_20150818
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for using conntrack helpers to assist protocol detection.
The new OVS_CT_ATTR_HELPER attribute of the CT action specifies a helper
to be used for this connection. If no helper is specified, then helpers
will be automatically applied as per the sysctl configuration of
net.netfilter.nf_conntrack_helper.
The helper may be specified as part of the conntrack action, eg:
ct(helper=ftp). Initial packets for related connections should be
committed to allow later packets for the flow to be considered
established.
Example ovs-ofctl flows allowing FTP connections from ports 1->2:
in_port=1,tcp,action=ct(helper=ftp,commit),2
in_port=2,tcp,ct_state=-trk,action=ct(recirc)
in_port=2,tcp,ct_state=+trk-new+est,action=1
in_port=2,tcp,ct_state=+trk+rel,action=1
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow matching and setting the ct_label field. As with ct_mark, this is
populated by executing the CT action. The label field may be modified by
specifying a label and mask nested under the CT action. It is stored as
metadata attached to the connection. Label modification occurs after
lookup, and will only persist when the conntrack entry is committed by
providing the COMMIT flag to the CT action. Labels are currently fixed
to 128 bits in size.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add functions to change connlabel length into nf_conntrack_labels.c so
they may be reused by other modules like OVS and nftables without
needing to jump through xt_match_check() hoops.
Suggested-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Florian Westphal <fw@strlen.de>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
The following patches will reuse this code from OVS.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allow matching and setting the ct_mark field. As with ct_state and
ct_zone, these fields are populated when the CT action is executed. To
write to this field, a value and mask can be specified as a nested
attribute under the CT action. This data is stored with the conntrack
entry, and is executed after the lookup occurs for the CT action. The
conntrack entry itself must be committed using the COMMIT flag in the CT
action flags for this change to persist.
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Expose the kernel connection tracker via OVS. Userspace components can
make use of the CT action to populate the connection state (ct_state)
field for a flow. This state can be subsequently matched.
Exposed connection states are OVS_CS_F_*:
- NEW (0x01) - Beginning of a new connection.
- ESTABLISHED (0x02) - Part of an existing connection.
- RELATED (0x04) - Related to an established connection.
- INVALID (0x20) - Could not track the connection for this packet.
- REPLY_DIR (0x40) - This packet is in the reply direction for the flow.
- TRACKED (0x80) - This packet has been sent through conntrack.
When the CT action is executed by itself, it will send the packet
through the connection tracker and populate the ct_state field with one
or more of the connection state flags above. The CT action will always
set the TRACKED bit.
When the COMMIT flag is passed to the conntrack action, this specifies
that information about the connection should be stored. This allows
subsequent packets for the same (or related) connections to be
correlated with this connection. Sending subsequent packets for the
connection through conntrack allows the connection tracker to consider
the packets as ESTABLISHED, RELATED, and/or REPLY_DIR.
The CT action may optionally take a zone to track the flow within. This
allows connections with the same 5-tuple to be kept logically separate
from connections in other zones. If the zone is specified, then the
"ct_zone" match field will be subsequently populated with the zone id.
IP fragments are handled by transparently assembling them as part of the
CT action. The maximum received unit (MRU) size is tracked so that
refragmentation can occur during output.
IP frag handling contributed by Andy Zhou.
Based on original design by Justin Pettit.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This variation on skb_dst_copy() doesn't require two skbs.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This will allow the ovs-conntrack code to reuse these macros.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously, we used the kernel-internal netlink actions length to
calculate the size of messages to serialize back to userspace.
However,the sw_flow_actions may not be formatted exactly the same as the
actions on the wire, so store the original actions length when
de-serializing and re-use the original length when serializing.
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
ACPI 6.0 NFIT Memory Device State Flags in Table 5-129 defines
NVDIMM status as follows. These bits indicate multiple info,
such as failures, pending event, and capability.
Bit [0] set to 1 to indicate that the previous SAVE to the
Memory Device failed.
Bit [1] set to 1 to indicate that the last RESTORE from the
Memory Device failed.
Bit [2] set to 1 to indicate that platform flush of data to
Memory Device failed. As a result, the restored data content
may be inconsistent even if SAVE and RESTORE do not indicate
failure.
Bit [3] set to 1 to indicate that the Memory Device is observed
to be not armed prior to OSPM hand off. A Memory Device is
considered armed if it is able to accept persistent writes.
Bit [4] set to 1 to indicate that the Memory Device observed
SMART and health events prior to OSPM handoff.
/sys/bus/nd/devices/nmemX/nfit/flags shows this flags info.
The output strings associated with the bits are "save", "restore",
"smart", etc., which can be confusing as they may be interpreted
as positive status, i.e. save succeeded.
Change also the dev_info() message in acpi_nfit_register_dimms()
to be consistent with the sysfs flags strings.
Reported-by: Robert Elliott <elliott@hp.com>
Signed-off-by: Toshi Kani <toshi.kani@hp.com>
[ross: rename 'not_arm' to 'not_armed']
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
[djbw: defer adding bit5, HEALTH_ENABLED, for now]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Broadcom buses may have more than 1 Ethernet device. This is used e.g.
to have few interfaces connected to different switch ports. So far we
saw chipsets with only 2 devices (e.g. BCM4706) but recent ones have
up to 3 (e.g. Netgear R8000 uses 3rd interface for most of switch
traffic, lower interfaces are for some kind of offloading).
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some of the stats handling code differs based on SR-IOV support,
and SRIOV support is only available if full-featured firmware is
used.
Do not use vadaptor stats if firmware mode is not set to
full-featured.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
iwlwifi:
* new Tx power firmware API
* bump max firmware API to 17
* fix bug in debug prints
* static checker fix
* fix unused defines
* fix command list on newest firmware
brcmfmac:
* support NVRAM loading for bcm47xx platform
* new debugfs entry for msgbuf protocol layer used with PCIe devices
ath10k:
* add spectral scan support for qca99x0
* add qca6164 support
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQEcBAABAgAGBQJV3dEKAAoJEG4XJFUm622bKpcH/2+9lcdUY0xplYPOA+eXIJoT
lA7fvFzm3Ez17OPOu+RCSYGnsiYHW47ayCdKqBcVkVIiE5kiRaztvDL7NpCN1mCY
SIJgixcL0aRinDqc2HZBJ2dG1AUeM7p3SxupuHFovtI2FBs8O+sWcx7/tJ26+f+Y
tSRwcZkWKNUOwqjQckFfD/k5PpAo06hxg8Zw85N+qNhWB8NyVgIZqHID1vOwQc02
rtqHl0BZxz2fKmsAeRKpZGlxFqYwc2Y/QT5YLgaOEDQ+gmoUsJKHUIge4fL7zTTG
I6UUTUbtu83SMKTAeg6BztM+JklEZRNRL5VLboE36dSmSsUBbZxi/WKf+OG70SE=
=W1bH
-----END PGP SIGNATURE-----
Merge tag 'wireless-drivers-next-for-davem-2015-08-26' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next
Kalle Valo says:
====================
Major changes:
iwlwifi:
* new Tx power firmware API
* bump max firmware API to 17
* fix bug in debug prints
* static checker fix
* fix unused defines
* fix command list on newest firmware
brcmfmac:
* support NVRAM loading for bcm47xx platform
* new debugfs entry for msgbuf protocol layer used with PCIe devices
ath10k:
* add spectral scan support for qca99x0
* add qca6164 support
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The fixed link values parsed from the device tree are stored in
the struct fixed_phy member status. The struct phy_device members
speed, duplex were not updated.
Signed-off-by: Madalin Bucur <madalin.bucur@freescale.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit c214ebe1eb29 ("batman-adv: move neigh_node list add into
batadv_neigh_node_new()") removed external calls to
batadv_neigh_node_get().
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
The maximum of hard_header_len and maximum of all needed_(head|tail)room of
all slave interfaces of a batman-adv device must be used to define the
batman-adv device needed_(head|tail)room. This is required to avoid too
small buffer problems when these slave devices try to send the encapsulated
packet in a tx path without the possibility to resize the skbuff.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
In batadv_hardif_disable_interface() there is a call to
batadv_softif_destroy_sysfs() which in turns invokes
unregister_netdevice() on the soft_iface.
After this point we cannot rely on the soft_iface object
anymore because it might get free'd by the netdev periodic
routine at any time.
For this reason the netdev_upper_dev_unlink(.., soft_iface) call
is moved before the invocation of batadv_softif_destroy_sysfs() so
that we can be sure that the soft_iface object is still valid.
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
commit 0511575c4d03 ("batman-adv: remove obsolete deleted attribute for
gateway node") incorrectly added an empy line and forgot to remove an
include.
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Acked-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
With rcu, the gateway node deleted attribute is not needed anymore. In
fact, it may delay the free of the gateway node and its referenced
structures. Therefore remove it altogether and simplify purging as well.
Signed-off-by: Simon Wunderlich <simon@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
All batadv_neigh_node_* functions expect the neigh_node list item to be part
of the orig_node->neigh_list, therefore the constructor of said list item
should be adding the newly created neigh_node to the respective list.
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Acked-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
The batadv_neigh_node_new() function already sets the hard_iface pointer.
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Acked-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
The batadv_neigh_node cleanup function 'batadv_neigh_node_free_rcu()'
takes care of reducing the hardif refcounter, hence it's only logical
to assume the creating function of that same object
'batadv_neigh_node_new()' takes care of increasing the same refcounter.
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Acked-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
Fix arm64 KVM issue when injecting an abort into a 32-bit guest, which
would lead to an illegal exception return at EL2 and a subsequent host
crash.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABCgAGBQJV3yqAAAoJELescNyEwWM0Z84H/i/6Aleyuu9b1JvpFAbLJSCq
tV9oXVIo8o0kIfN9B4YSuHrFCCizVukLczKm10o5NCT559WXCWX7C0h2jpoaqIWm
I0cKZWlBtp6JANATG5c7RLW5WdjuKFAtK6Pg7oPcaceqO6EsIyE+z9yu5UCRRDyk
Tyl8WRRbPwfmyFUMNYtm/Oo3RUPqBXCE+CBMiTVq31fUblPwEgP2y6JGgGG2Vjx8
fMwdu+nExlw7InBah8E2CcLJPWYxfDq9OKUok0zQScd5fJMq8ueP6rpunM2Iup2X
0AJY+pD8vBk7l4Rkq3eCYqTEyJus3ANHE8auYi4i3hr2Hsyzy453Zz/ITv3b9T4=
=MhCr
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull amr64 kvm fix from Will Deacon:
"We've uncovered a nasty bug in the arm64 KVM code which allows a badly
behaved 32-bit guest to bring down the host. The fix is simple (it's
what I believe we call a "brown paper bag" bug) and I don't think it
makes sense to sit on this, particularly as Russell ended up
triggering this rather than just somebody noticing a potential problem
by inspection.
Usually arm64 KVM changes would go via Paolo's tree, but he's on
holiday at the moment and the deal is that anything urgent gets
shuffled via the arch trees, so here it is.
Summary:
Fix arm64 KVM issue when injecting an abort into a 32-bit guest, which
would lead to an illegal exception return at EL2 and a subsequent host
crash"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: KVM: Fix host crash when injecting a fault into a 32bit guest
When injecting a fault into a misbehaving 32bit guest, it seems
rather idiotic to also inject a 64bit fault that is only going
to corrupt the guest state. This leads to a situation where we
perform an illegal exception return at EL2 causing the host
to crash instead of killing the guest.
Just fix the stupid bug that has been there from day 1.
Cc: <stable@vger.kernel.org>
Reported-by: Russell King <rmk+kernel@arm.linux.org.uk>
Tested-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Make sure to indicate to tunnel driver that key.tun_id is set,
otherwise gre won't recognize the metadata.
Fixes: d3aa45ce6b ("bpf: add helpers to access tunnel metadata")
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bump version and update the copyright year for i40evf.
Change-ID: Iddb81b9dba09f0dc57ab54937b5821ecdd721ff6
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Store the CEE TLV status returned by firmware to allow drivers to dump that
for debug purposes.
Change-ID: Ie3c4cf8cebabee4f15e1e3fdc4fc8a68bbca40ee
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The kernel notifies all VXLAN capable registered drivers, i.e. any
driver that implements ndo_add_vxlan_port(), of the addition of a
port so that the driver can track which ports are in use. There's
no need to log this - it just fills the system log with useless and
irksome noise.
Also, when failing to init SR-IOV interfaces the driver was printing the
same message twice. Just remove the inner printk and let the outer message
catch enable as well as the other failures.
Change-ID: Id5ecb1d425c2a357ee2bc1635dab24553831dade
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
There were quite a few issues when the wrong defines were getting used
in the VF driver. This patch fixes the code where PF driver registers
were getting used for VF driver, and also removes the registers that are
not being used from the VF register file.
Change-ID: If116a9730112950d006eb8ec763998fc914cc839
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Acked-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Use CTLN1 instead of CTLN for the VF relative register space.
Change-ID: Iefba63faf0307af55fec8dbb64f26059f7d91318
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Turns out that 'inavlid' is an inavlid spelling for 'invalid'.
Change-ID: Ie1fe2d0f8d1ba75ab880594875ec2e4152a76f61
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The existing comment is incorrect. Add new comment to point out that the
PF reset does not affect link but if the reset is changed to a different
type that does affect link then the link test would need to be moved to
before the reset.
Change-ID: I28d786f46e9465860babdee61c1dba51016464df
Reported-by: Jeremiah Kyle <jeremiah.kyle@intel.com>
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch adds capability to update per VEB per TC statistics and dump
it via ethtool. It also adds a structure to hold VEB per TC statistics.
The fields can be filled by reading the GLVEBTC_* counters.
Change-ID: I28b4759b9ab6ad5a61f046a1bc9ef6b16fe31538
Signed-off-by: Neerav Parikh <neerav.parikh@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Treat netqueues the same way we do virtual functions when someone wants to
run the ethtool offline diagnostic test.
Change-ID: Id48d2b933f1fd0db7be06305a93c6ebe3dc821f5
Signed-off-by: Greg Rose <gregory.v.rose@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch fixes the driver flow to take into account legacy interrupts.
Over time we added code that assumes MSIX is the only mode that the
driver runs in. It also enables a legacy workaround to trigger SWINT
when the TX ring has non-cache aligned descriptors pending and interrupts
are disabled.
We work with a single vector in MSI mode too, so apply the same
restrictions as Legacy.
Change-ID: I826ddff1f9bd45d2dbe11f56a3ddcef0dbf42563
Signed-off-by: Anjali Singhai Jain <anjali.singhai@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We should be stopping the service task and flow director on
shutdown not on suspension.
Signed-off-by: Catherine Sullivan <catherine.sullivan@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The port.crc_errors is really an RX counter, so let's mark it as such.
Change-ID: I179afd3f8a95d45229bb4163a6aeb01f0d2d250b
Signed-off-by: Shannon Nelson <shannon.nelson@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Sparse cries when we compare an __le16 to a u16, almost like it cares
about architectures other than x86. Weird. Use the le16_to_cpu macro to
make it stop crying.
Change-ID: Id068f4d7868a2d3df234a791a76d15938f37db35
Signed-off-by: Mitch Williams <mitch.a.williams@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Simon Horman says:
====================
Second Round of IPVS Updates for v4.3
I realise these are a little late in the cycle, so if you would prefer
me to repost them for v4.4 then just let me know.
The updates include:
* A new scheduler from Raducu Deaconu
* Enhanced configurability of the sync daemon from Julian Anastasov
====================
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
RFC 4443 added two new codes values for ICMPv6 type 1:
5 - Source address failed ingress/egress policy
6 - Reject route to destination
And RFC 7084 states in L-14 that IPv6 Router MUST send ICMPv6 Destination
Unreachable with code 5 for packets forwarded to it that use an address
from a prefix that has been invalidated.
Codes 5 and 6 are more informative subsets of code 1.
Signed-off-by: Andreas Herz <andi@geekosphere.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pull block fixes from Jens Axboe:
"Two fixes in this pull request:
- The writeback regression fix from Tejun, which has been weeks in
the making. This fixes a case where we would sometimes not issue
writeback when we should have.
- An older fix for a memory corruption issue in mtip32xx. It was
deferred since we wanted a better fix for this (driver should not
have to handle that case), but given the timing, it's better to put
the simple fix in for 4.2 release"
* 'for-linus' of git://git.kernel.dk/linux-block:
mtip32x: fix regression introduced by blk-mq per-hctx flush
writeback: sync_inodes_sb() must write out I_DIRTY_TIME inodes and always call wait_sb_inodes()
Alexei Starovoitov says:
====================
act_bpf: remove spinlock in fast path
v1 version had a race condition in cleanup path of bpf_prog.
I tried to fix it by adding new callback 'cleanup_rcu' to 'struct tcf_common'
and call it out of act_api cleanup path, but Daniel noticed
(thanks for the idea!) that most of the classifiers already do action cleanup
out of rcu callback.
So instead this set of patches converts tcindex and rsvp classifiers to call
tcf_exts_destroy() after rcu grace period and since action cleanup logic
in __tcf_hash_release() is only called when bind and refcnt goes to zero,
it's guaranteed that cleanup() callback is called from rcu callback.
More specifically:
patches 1 and 2 - simple fixes
patches 2 and 3 - convert tcf_exts_destroy in tcindex and rsvp to call_rcu
patch 5 - removes spin_lock from act_bpf
The cleanup of actions is now universally done after rcu grace period
and in the future we can drop (now unnecessary) call_rcu from tcf_hash_destroy()
patch 5 is using synchronize_rcu() in act_bpf replacement path, since it's
very rare and alternative of dynamically allocating 'struct tcf_bpf_cfg' just
to pass it to call_rcu looks even less appealing.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to act_gact/act_mirred, act_bpf can be lockless in packet processing
with extra care taken to free bpf programs after rcu grace period.
Replacement of existing act_bpf (very rare) is done with synchronize_rcu()
and final destruction is done from tc_action_ops->cleanup() callback that is
called from tcf_exts_destroy()->tcf_action_destroy()->__tcf_hash_release() when
bind and refcnt reach zero which is only possible when classifier is destroyed.
Previous two patches fixed the last two classifiers (tcindex and rsvp) to
call tcf_exts_destroy() from rcu callback.
Similar to gact/mirred there is a race between prog->filter and
prog->tcf_action. Meaning that the program being replaced may use
previous default action if it happened to return TC_ACT_UNSPEC.
act_mirred race betwen tcf_action and tcfm_dev is similar.
In all cases the race is harmless.
Long term we may want to improve the situation by replacing the whole
tc_action->priv as single pointer instead of updating inner fields one by one.
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adjust destroy path of cls_rsvp to call tcf_exts_destroy() after
rcu grace period.
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>