Rahul Lakkireddy says:
====================
This series of patches add UDP Segmentation Offload (USO) supported
by Chelsio T5/T6 NICs.
Patch 1 updates the current Scatter Gather List (SGL) DMA unmap logic
for USO requests.
Patch 2 adds USO support for NIC and MQPRIO QoS offload Tx path.
Patch 3 adds missing stats for MQPRIO QoS offload Tx path.
====================
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Implement and export UDP segmentation offload (USO) support for both
NIC and MQPRIO QoS offload Tx path. Update appropriate logic in Tx to
parse GSO info in skb and configure FW_ETH_TX_EO_WR request needed to
perform USO.
v2:
- Remove inline keyword from write_eo_udp_wr() in sge.c. Let the
compiler decide.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
The FW_ETH_TX_EO_WR used for sending UDP Segmentation Offload (USO)
requests expects the headers to be part of the descriptor and the
payload to be part of the SGL containing the DMA mapped addresses.
Hence, the DMA address in the first entry of the SGL can start after
the packet headers. Currently, unmap_sgl() tries to unmap from this
wrong offset, instead of the originally mapped DMA address.
So, use existing unmap_skb() instead, which takes originally saved DMA
addresses as input. Update all necessary Tx paths to save the original
DMA addresses, so that unmap_skb() can unmap them properly.
v2:
- No change.
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Minor conflict in drivers/s390/net/qeth_l2_main.c, kept the lock
from commit c8183f5489 ("s390/qeth: fix potential deadlock on
workqueue flush"), removed the code which was removed by commit
9897d583b0 ("s390/qeth: consolidate some duplicated HW cmd code").
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Pull networking fixes from David Miller:
1) Validate tunnel options length in act_tunnel_key, from Xin Long.
2) Fix DMA sync bug in gve driver, from Adi Suresh.
3) TSO kills performance on some r8169 chips due to HW issues, disable
by default in that case, from Corinna Vinschen.
4) Fix clock disable mismatch in fec driver, from Chubong Yuan.
5) Fix interrupt status bits define in hns3 driver, from Huazhong Tan.
6) Fix workqueue deadlocks in qeth driver, from Julian Wiedmann.
7) Don't napi_disable() twice in r8152 driver, from Hayes Wang.
8) Fix SKB extension memory leak, from Florian Westphal.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (54 commits)
r8152: avoid to call napi_disable twice
MAINTAINERS: Add myself as maintainer of virtio-vsock
udp: drop skb extensions before marking skb stateless
net: rtnetlink: prevent underflows in do_setvfinfo()
can: m_can_platform: remove unnecessary m_can_class_resume() call
can: m_can_platform: set net_device structure as driver data
hv_netvsc: Fix send_table offset in case of a host bug
hv_netvsc: Fix offset usage in netvsc_send_table()
net-ipv6: IPV6_TRANSPARENT - check NET_RAW prior to NET_ADMIN
sfc: Only cancel the PPS workqueue if it exists
nfc: port100: handle command failure cleanly
net-sysfs: fix netdev_queue_add_kobject() breakage
r8152: Re-order napi_disable in rtl8152_close
net: qca_spi: Move reset_count to struct qcaspi
net: qca_spi: fix receive buffer size check
net/ibmvnic: Ignore H_FUNCTION return from H_EOI to tolerate XIVE mode
Revert "net/ibmvnic: Fix EOI when running in XIVE mode"
net/mlxfw: Verify FSM error code translation doesn't exceed array size
net/mlx5: Update the list of the PCI supported devices
net/mlx5: Fix auto group size calculation
...
By default s_maxbytes is set to MAX_NON_LFS, which limits the usable
file size to 2GB, enforced by the vfs.
Commit b9b1f8d593 ("AFS: write support fixes") added support for the
64-bit fetch and store server operations, but did not change this value.
As a result, attempts to write past the 2G mark result in EFBIG errors:
$ dd if=/dev/zero of=foo bs=1M count=1 seek=2048
dd: error writing 'foo': File too large
Set s_maxbytes to MAX_LFS_FILESIZE.
Fixes: b9b1f8d593 ("AFS: write support fixes")
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Servers sending callback breaks to the YFS_CM_SERVICE service may
send up to YFSCBMAX (1024) fids in a single RPC. Anything over
AFSCBMAX (50) will cause the assert in afs_break_callbacks to trigger.
Remove the assert, as the count has already been checked against
the appropriate max values in afs_deliver_cb_callback and
afs_deliver_yfs_cb_callback.
Fixes: 35dbfba311 ("afs: Implement the YFS cache manager service")
Signed-off-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Update FW API minor version to align to current value advertised
by FW in new NVM images.
Signed-off-by: Kevin Scott <kevin.c.scott@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The code in ice_sched_cleanup_all checks whether the port info is NULL
prior to calling ice_sched_clear_port. However, ice_sched_clear_port
already checks whether port info is non-NULL.
More importantly, it also checks whether the port structure has been
initialized by checking its port_state field as well.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Add code to query and set the number of channels on the primary VSI for a
PF. This is accessed from the 'ethtool -l' and 'ethtool -L' commands,
respectively. Though the ice driver supports asymmetric queues report an
IRQ vector that has both Rx and Tx queues attached and is counted as a
'combined' channel.
Signed-off-by: Henry Tieman <henry.w.tieman@intel.com>
Co-developed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
When code reaches the "out" label, n is guaranteed to be valid so we can
unconditionally call neigh_release.
Also change the label to release_neigh to better reflect the fact that
we unconditionally free the neighbour and also match other labels
convention.
Signed-off-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Improve mlx5e_route_lookup_ipv6 function structure by avoiding #ifdef then
return -EOPNOTSUPP in the middle of the function code.
To do so, we stub out mlx5e_tc_tun_create_header_ipv6 which is the only
caller of this helper function to avoid calling it altogether
when ipv6 is compiled out, which should also cleanup some compiler
warnings of unused variables.
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Eli Cohen <eli@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Refactor flex parser tunnel code:
- Add definition for flex parser tunneling header for VXLAN-GPE
- Use macros for VXLAN-GPE SW steering when building STE
- Refactor the code to reflect that this is a VXLAN GPE
only code and not a general flex parser code.
This also significantly simplifies addition of more
flex parser protocols, such as Geneve.
Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
The MODIFY_HCA_VPORT_CONTEXT uses field_selector to mask fields needed
to be written, other fields are required to be zero according to the
HW specification. The supported fields are controlled by bitfield
and limited to vport state, node and port GUIDs.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Implement the VF stats gathering via the kernel via ndo_get_vf_stats().
The driver will show per-VF stats in the output of the
ip -s link show dev <PF> command.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The virtchannel interface was repeating a lot of strings
and wasting storage space in the kernel. There was also
inconsistent messages for the same thing. Consolidate all
those messages and bit checks into a couple of helper functions.
Also, reduce stack space usage by simplifying getting the pointer
to the pf using a helper.
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Co-developed-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
We use &pf->dev->pdev all over the code. Add a simple
macro to do this for us. When multiple de-references
like this are being done add a local struct device
variable.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In situations where we alloc and free memory within the same function do
not use the devm_* variants; use regular alloc and free functions. Remove
any unused vars if there are no usages after these changes.
Also, replace an allocate and copy with kmemdup() and remove an
unnecessary memset() to 0 after a kzalloc().
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently ice_clear_vsi_promisc() detects if the VLAN ID sent is not 0
and sets the recipe_id to ICE_SW_LKUP_PROMISC_VLAN in that case and
ICE_SW_LKUP_PROMISC if the VLAN_ID is 0. However this doesn't allow VLAN
0 promiscuous rules to be removed, but they can be added. Fix this by
checking if the promisc_mask contains ICE_PROMISC_VLAN_RX or
ICE_PROMISC_VLAN_TX. This change was made to match what is being done
for ice_set_vsi_promisc().
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently there can be a case where a DCB map is applied and there are
more interrupt vectors (vsi->num_q_vectors) than Rx queues (vsi->num_rxq)
and Tx queues (vsi->num_txq). If we try to set coalesce settings in this
case it will report a false failure. Fix this by checking if vector index
is valid with respect to the number of Tx and Rx queues configured.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
It is wrong to set PF disable state flag for all VFs when freeing VF
resources - Instead, we should set VF disable state flag for each VF with
its resources being returned to the device. Right now, all VF opcodes,
mailbox communication to clear its resources as well fails - since we
already indicate that PF is in disable state, with all VFs not active. In
addition, we don't need to notify VF that PF is intending to reset it, if
it is already in disabled state.
Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
In the case of an invalid virtchannel request the driver
would return uninitialized data to the VF from the PF stack
which is a bug. Fix by initializing the stack variable
earlier in the function before any return paths can be taken.
Fixes: 1071a8358a ("ice: Implement virtchnl commands for AVF support")
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently when adding/deleting vlans in ice_vc_process_vlan_msg()
we are calling ice_vsi_manage_vlan_stripping() to enable/disable
when adding and deleting a VLAN respectively. This is wrong
because adding/deleting VLANs has nothing to do with configuring
VLAN stripping. VLAN stripping is configured through the
following VIRTCHNL operations:
VIRTCHNL_OP_ENABLE_VLAN_STRIPPING
VIRTCHNL_OP_DISABLE_VLAN_STRIPPING
Unfortunately we can't just remove this because then stripping
will never be configured on VF initialization. Fix this by
adding a new function that initializes (disables/enables) VLAN
stripping for the VF based on the device supported capabilities.
This allows us to remove the call to
ice_vsi_manage_vlan_stripping() in ice_vc_process_vlan_msg().
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Currently if the host disables VLAN offloads on the VF by
not setting the VIRTCHNL_VF_OFFLOAD_VLAN capability bit
we will still honor VF VLAN configuration messages over
VIRTCHNL. These messages (i.e. enable/disable VLAN stripping
and VLAN filtering) should be blocked when the feature
is not supported. Fix that by adding a helper function to
determine if the VF is allowed to do VLAN operations based
on the host's VF configuration.
Also, mirror the VF communicated capabilities in the host's
VF configuration.
Signed-off-by: Brett Creeley <brett.creeley@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Firmware always returns 8 as the max number of supported TCs. However on
devices with more than 4 ports, the maximum number of TCs per port is
limited to 4. Check and, if necessary, correct the reporting of
capabilities for devices with more than 4 ports.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Store the number of functions the device has and use this number when
setting safe mode capabilities instead of calculating it.
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Co-developed-by: Kevin Scott <kevin.c.scott@intel.com>
Signed-off-by: Kevin Scott <kevin.c.scott@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Fix following sparse warnings:
drivers/net/dsa/ocelot/felix.c:351:6: warning: symbol 'felix_txtstamp' was not declared. Should it be static?
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Call napi_disable() twice would cause dead lock. There are three situations
may result in the issue.
1. rtl8152_pre_reset() and set_carrier() are run at the same time.
2. Call rtl8152_set_tunable() after rtl8152_close().
3. Call rtl8152_set_ringparam() after rtl8152_close().
For #1, use the same solution as commit 8481141246 ("r8152: Re-order
napi_disable in rtl8152_close"). For #2 and #3, add checking the flag
of IFF_UP and using napi_disable/napi_enable during mutex.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge misc fixes from Andrew Morton:
"Three fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
mm/ksm.c: don't WARN if page is still mapped in remove_stable_node()
mm/memory_hotplug: don't access uninitialized memmaps in shrink_zone_span()
Revert "fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()"
End.DT6 behavior makes use of seg6_lookup_nexthop() function which drops
all packets that are destined to be locally processed. However, DT* should
be able to deliver decapsulated packets that are destined to local
addresses. Function seg6_lookup_nexthop() is also used by DX6, so in order
to maintain compatibility I created another routing helper function which
is called seg6_lookup_any_nexthop(). This function is able to take into
account both packets that have to be processed locally and the ones that
are destined to be forwarded directly to another machine. Hence,
seg6_lookup_any_nexthop() is used in DT6 rather than seg6_lookup_nexthop()
to allow local delivery.
Signed-off-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Signed-off-by: David S. Miller <davem@davemloft.net>
In commit a82055af59 ("netfilter: nft_payload: add VLAN offload
support"), VLAN fields in struct flow_dissector_key_vlan were unionized
with the intention of introducing another field that covered the whole TCI
header. However without a wrapping struct the subfields end up sharing the
same bits. As a result, "tc filter add ... flower vlan_id 14" specifies not
only vlan_id, but also vlan_priority.
Fix by wrapping the individual VLAN fields in a struct.
Fixes: a82055af59 ("netfilter: nft_payload: add VLAN offload support")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
iQFHBAABCgAxFiEEmvEkXzgOfc881GuFWsYho5HknSAFAl3X9E0THG1rbEBwZW5n
dXRyb25peC5kZQAKCRBaxiGjkeSdIIeQB/9EctdOAnM39+4vy22n9veOj/qEWW1S
VbIgG2WG3eHRPhCVoA7d61QS3/Mdeme0HU6icUXG3AYomCEEDu/Rc0kaFL8G3Zj5
mfL7ypIrjwjC2aEvaRzWVk4+0RW2xAchWONLycaC3ubolmr2q04ETDoY/dmWgByG
UpkhcCuy95kxD5e10do0g8UsP8xJwTEgRD91uPk2/M/f90LTJ1EVCh4nsntoYUDj
PmhavZNoiRBQTjajaHzE6jBQp6kJFvImcbMpn1mFQOuqSnk/b0hWSkgtP40Mp6xD
1j9M2JuttE5ai0eWh8xk0m3/mbCuCeTvJyW4F+c2OQ/aPzl1FqNkrTqh
=9oQw
-----END PGP SIGNATURE-----
Merge tag 'linux-can-fixes-for-5.4-20191122' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2019-11-22
this is a pull request of 2 patches for net/master, if possible for the
current release cycle. Otherwise these patches should hit v5.4 via the
stable tree.
Both patches of this pull request target the m_can driver. Pankaj Sharma
fixes the fallout in the m_can_platform part, which appeared with the
introduction of the m_can platform framework.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
patchset from Kan Yan (Google) and Toke Høiland-Jørgensen (Redhat).
The effect is intended to eventually be similar to BQL, but byte
queue limits are not useful in wifi where the actual throughput can
vary by around 4 orders of magnitude. There are more details in the
patches themselves.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEH1e1rEeCd0AIMq6MB8qZga/fl8QFAl3X3AcACgkQB8qZga/f
l8Q8WQ/+M+KaxTsqlLCZFoQwegQ3Z2i6wZw0uhPEJ3vDWBdOEtopMzv0v69DQPV4
TQdXj+SoXLijvcUah6nc8Ve8am7wjoxf6YfHKvhbJK3xc3L25H+W5+0dZSzWXX1l
ldhv4tBF5nBJAAhAN6DX8oOp6B6t7E5vTbwTcW6fr897g/ypXqM5zl39PQwOCznA
SwRoQua5Wz/EIIpljK9Z9PSv/B2FIa3k6QgZGJizSKZd+wjiYJC0CM1hYbWqZlSx
TL5Zy5QbJhsC7jpByVfJ/SrWuKT5uHVobhUY7uEpLTV2VuMTUSvshY0Naz/uD48+
E6rLkJWD/DiZijCnRuJyh7uFfoWsHOjav69vqzYwTYrtqGBoDbQ3jtYyyePyp1c4
h182yh7IcE7t8CSpgOGPDvYC3o4JYHZhXjyonXS5es4IOrTLLf26HOotvjuPCS4U
KdrDuv/ayYW4C5suBs/E/TIfqCEW+glhJuoEL3ruFXVtvpjLfaAbFsP2OH7M3Vg+
PPOKGtgz0JkdanNuH2aEcEI6UrtHYnAwqpD8DXi2zxk7eKc/yWW8A/guPFVzNsH9
QSucdLMWccfEgQhnHilelEfGPamNGeANQs0uDsdTE9kJ9y9OofgncYsfMb9R5R3p
ezFuWhPtX4DS13lvXLPxl8l6xmz/NKWSwWSqlIlm8u5xi9oyOss=
=0uzN
-----END PGP SIGNATURE-----
Merge tag 'mac80211-next-for-net-next-2019-11-22' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
The interesting new thing here is AQL, the Airtime Queue Limit
patchset from Kan Yan (Google) and Toke Høiland-Jørgensen (Redhat).
The effect is intended to eventually be similar to BQL, but byte
queue limits are not useful in wifi where the actual throughput can
vary by around 4 orders of magnitude. There are more details in the
patches themselves.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Since I'm actively working on vsock and virtio/vhost transports,
Stefan suggested to help him to maintain it.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is observed that TIPC service binding order will not be kept in the
publication event report to user if the service is subscribed after the
bindings.
For example, services are bound by application in the following order:
Server: bound port A to {18888,66,66} scope 2
Server: bound port A to {18888,33,33} scope 2
Now, if a client subscribes to the service range (e.g. {18888, 0-100}),
it will get the 'TIPC_PUBLISHED' events in that binding order only when
the subscription is started before the bindings.
Otherwise, if started after the bindings, the events will arrive in the
opposite order:
Client: received event for published {18888,33,33}
Client: received event for published {18888,66,66}
For the latter case, it is clear that the bindings have existed in the
name table already, so when reported, the events' order will follow the
order of the rbtree binding nodes (- a node with lesser 'lower'/'upper'
range value will be first).
This is correct as we provide the tracking on a specific service status
(available or not), not the relationship between multiple services.
However, some users expect to see the same order of arriving events
irrespective of when the subscription is issued. This turns out to be
easy to fix. We now add functionality to ensure that publication events
always are issued in the same temporal order as the corresponding
bindings were performed.
v2: replace the unnecessary macro - 'publication_after()' with inline
function.
v3: reuse 'time_after32()' instead of reinventing the same exact code.
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
When setting up a cluster with non-replicast/replicast capability
supported. This capability will be disabled for broadcast send link
in order to be backwards compatible.
However, when these non-support nodes left and be removed out the cluster.
We don't update this capability on broadcast send link. Then, some of
features that based on this capability will also disabling as unexpected.
In this commit, we make sure the broadcast send link capabilities will
be re-calculated as soon as a node removed/rejoined a cluster.
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Once udp stack has set the UDP_SKB_IS_STATELESS flag, later skb free
assumes all skb head state has been dropped already.
This will leak the extension memory in case the skb has extensions other
than the ipsec secpath, e.g. bridge nf data.
To fix this, set the UDP_SKB_IS_STATELESS flag only if we don't have
extensions or if the extension space can be free'd.
Fixes: 895b5c9f20 ("netfilter: drop bridge nf reset from nf_reset")
Cc: Paolo Abeni <pabeni@redhat.com>
Reported-by: Byron Stanoszek <gandalf@winds.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix problems with switching cpufreq drivers on some x86 systems with
ACPI (and with changing the operation modes of the intel_pstate driver
on those systems) introduced by recent changes related to the
management of frequency limits in cpufreq.
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAl3X1LQSHHJqd0Byand5
c29ja2kubmV0AAoJEILEb/54YlRxAgMQAKFu4e3IZ7vVMo3gu61r9amqVG03prMh
Ca8lvMd0pxp1wtL6gSkAB0Kces1hwooz4bpA/Khm7igBfdilcNu9gv/d9EvGRCU1
/iOWHrsYS4gQnWwjJ001x+PokQQEFv+uGw9a6vwlEVU4a6TImlJKF+99Beogzr7U
tuTzEpim2ruMHFaP5EqjC/XcU+uzyW2rMnc7fH0t55W0saIT2trI6Nj+DJgZWPth
eRbgS4aOy3Bt9foBCVOFtkXz5VvHS3EUMPI7DxOx3OTt8D469NBDYx/ui8y7frIF
t3vME/UuWCJmEUJ7o0CqhVcB4TW5gZaIE7P2TO9pSCxPxKtPbROdfrR3snGPLjNl
syD7yqwuYpDbfi+0v3Mi3fnUqnnsiXPg2mFCl/w0tgIKz1iAv20pOWnCqja4QreG
Z+Ax1owzrEYlDaV5lDXYZd48dqItj4dMuUBSwXu8RBk+LE/Bjro0jnnK4zUD/M1m
sukrvgFrIXpTgqSnm8sX4fMyZnEvmNtG2kAYl6F84k2EICbOLkXD9ypBANb/9C3h
KcSzJFr76de18s5bnMBgoQRvfElbOotr63TyzeRo/6mF0G4+BpyAEsP7FzRSq+6u
cd+lrCvItzwIgD/5XU1WpkoOoYM//c0Y9JUHKwtnd+FSp1psGJwlhgbbckazQ699
A5DC7maAkkHb
=S23p
-----END PGP SIGNATURE-----
Merge tag 'pm-5.4-final' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management regression fix from Rafael Wysocki:
"Fix problems with switching cpufreq drivers on some x86 systems with
ACPI (and with changing the operation modes of the intel_pstate driver
on those systems) introduced by recent changes related to the
management of frequency limits in cpufreq"
* tag 'pm-5.4-final' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: QoS: Invalidate frequency QoS requests after removal
amdgpu:
- Remove experimental flag for navi14
- Fix confusing power message failures on older VI parts
- Hang fix for gfxoff when using the read register interface
- Two stability regression fixes for Raven
i915:
- Fix kernel oops on dumb_create ioctl on no crtc situation
- Fix bad ugly colored flash on VLV/CHV related to gamma LUT update
- Fix unity of the frequencies reported on PMU
- Fix kernel oops on set_page_dirty using better locks around it
- Protect the request pointer with RCU to prevent it being freed while we might need still
- Make pool objects read-only
- Restore physical addresses for fb_map to avoid corrupted page table
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJd1y5MAAoJEAx081l5xIa+HMUP/3G2QKUMN6hqno7EZQjJPhWs
UFaVHhwGtWdwyntwUtSxkZEvj1+1InSw2x/vhBdBghFZ4j5yO63v1pe5EkvAZ4Ff
Yg2z0vhl0tqQ7w4upe5C8TYztBvGqInMdbt76fIAB8jXM10lpPDZ98eVmVTotvCE
6KHzP7/xW4l53SRnptMM+pHrRlaoc4NpLkTPexpJ7woZv07ktG5mGw1/+kEuHi5q
qrg9PKOg3z3Q77yq4dxcH3ET+IL9I34A5WktTXCd1xIowdfzkC7rpr5DVHaVjUqf
Io0Z+ca54enUtfLs5S0pbgmLll9SQ8GMSP1BuwLmZx5Xxqm0TM4paQ0lZV2mVa7X
izWYni+eIIINMqeIfo8xtegZ2YUAz2unx+hy4HK5VGgnTalqidsgfQ+sFVScPsva
3DvgA6Hrz4AiYCen1s01DMuJLi/O7BmxwLwrKgpS3XWYVZKpmLKZqdZpkwLBQC7T
OvluGCLyFsaZj/FAht6l4AqotcWURHd/ujOmK5AxEEelTT1fMAy1KjyiwouWTEIw
7HCRd2yGkqUt3ShKABwTEh0+m4kub+Id3CaXCcu2Ce+bmSoseq564XPXrRzK7eUX
KuuTuj9opXvhHLBsVVGkHs72xWNgc7MoVORWxfDmSGRrD5a93tbltlNmuy+PCAKQ
FY3YQmvWYcLenZAgJ4VX
=0Z0x
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-2019-11-22' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Two sets of fixes in here, one for amdgpu, and one for i915.
The amdgpu ones are pretty small, i915's CI system seems to have a few
problems in the last week or so, there is one major regression fix for
fb_mmap, but there are a bunch of other issues fixed in there as well,
oops, screen flashes and rcu related.
amdgpu:
- Remove experimental flag for navi14
- Fix confusing power message failures on older VI parts
- Hang fix for gfxoff when using the read register interface
- Two stability regression fixes for Raven
i915:
- Fix kernel oops on dumb_create ioctl on no crtc situation
- Fix bad ugly colored flash on VLV/CHV related to gamma LUT update
- Fix unity of the frequencies reported on PMU
- Fix kernel oops on set_page_dirty using better locks around it
- Protect the request pointer with RCU to prevent it being freed
while we might need still
- Make pool objects read-only
- Restore physical addresses for fb_map to avoid corrupted page
table"
* tag 'drm-fixes-2019-11-22' of git://anongit.freedesktop.org/drm/drm:
drm/i915/fbdev: Restore physical addresses for fb_mmap()
Revert "drm/amd/display: enable S/G for RAVEN chip"
drm/amdgpu: disable gfxoff on original raven
drm/amdgpu: disable gfxoff when using register read interface
drm/amd/powerplay: correct fine grained dpm force level setting
drm/amd/powerplay: issue no PPSMC_MSG_GetCurrPkgPwr on unsupported ASICs
drm/amdgpu: remove experimental flag for Navi14
drm/i915: make pool objects read-only
drm/i915: Protect request peeking with RCU
drm/i915/userptr: Try to acquire the page lock around set_page_dirty()
drm/i915/pmu: "Frequency" is reported as accumulated cycles
drm/i915: Preload LUTs if the hw isn't currently using them
drm/i915: Don't oops in dumb_create ioctl if we have no crtcs
It's possible to hit the WARN_ON_ONCE(page_mapped(page)) in
remove_stable_node() when it races with __mmput() and squeezes in
between ksm_exit() and exit_mmap().
WARNING: CPU: 0 PID: 3295 at mm/ksm.c:888 remove_stable_node+0x10c/0x150
Call Trace:
remove_all_stable_nodes+0x12b/0x330
run_store+0x4ef/0x7b0
kernfs_fop_write+0x200/0x420
vfs_write+0x154/0x450
ksys_write+0xf9/0x1d0
do_syscall_64+0x99/0x510
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Remove the warning as there is nothing scary going on.
Link: http://lkml.kernel.org/r/20191119131850.5675-1-aryabinin@virtuozzo.com
Fixes: cbf86cfe04 ("ksm: remove old stable nodes more thoroughly")
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Let's limit shrinking to !ZONE_DEVICE so we can fix the current code.
We should never try to touch the memmap of offline sections where we
could have uninitialized memmaps and could trigger BUGs when calling
page_to_nid() on poisoned pages.
There is no reliable way to distinguish an uninitialized memmap from an
initialized memmap that belongs to ZONE_DEVICE, as we don't have
anything like SECTION_IS_ONLINE we can use similar to
pfn_to_online_section() for !ZONE_DEVICE memory.
E.g., set_zone_contiguous() similarly relies on pfn_to_online_section()
and will therefore never set a ZONE_DEVICE zone consecutive. Stopping
to shrink the ZONE_DEVICE therefore results in no observable changes,
besides /proc/zoneinfo indicating different boundaries - something we
can totally live with.
Before commit d0dc12e86b ("mm/memory_hotplug: optimize memory
hotplug"), the memmap was initialized with 0 and the node with the right
value. So the zone might be wrong but not garbage. After that commit,
both the zone and the node will be garbage when touching uninitialized
memmaps.
Toshiki reported a BUG (race between delayed initialization of
ZONE_DEVICE memmaps without holding the memory hotplug lock and
concurrent zone shrinking).
https://lkml.org/lkml/2019/11/14/1040
"Iteration of create and destroy namespace causes the panic as below:
kernel BUG at mm/page_alloc.c:535!
CPU: 7 PID: 2766 Comm: ndctl Not tainted 5.4.0-rc4 #6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
RIP: 0010:set_pfnblock_flags_mask+0x95/0xf0
Call Trace:
memmap_init_zone_device+0x165/0x17c
memremap_pages+0x4c1/0x540
devm_memremap_pages+0x1d/0x60
pmem_attach_disk+0x16b/0x600 [nd_pmem]
nvdimm_bus_probe+0x69/0x1c0
really_probe+0x1c2/0x3e0
driver_probe_device+0xb4/0x100
device_driver_attach+0x4f/0x60
bind_store+0xc9/0x110
kernfs_fop_write+0x116/0x190
vfs_write+0xa5/0x1a0
ksys_write+0x59/0xd0
do_syscall_64+0x5b/0x180
entry_SYSCALL_64_after_hwframe+0x44/0xa9
While creating a namespace and initializing memmap, if you destroy the
namespace and shrink the zone, it will initialize the memmap outside
the zone and trigger VM_BUG_ON_PAGE(!zone_spans_pfn(page_zone(page),
pfn), page) in set_pfnblock_flags_mask()."
This BUG is also mitigated by this commit, where we for now stop to
shrink the ZONE_DEVICE zone until we can do it in a safe and clean way.
Link: http://lkml.kernel.org/r/20191006085646.5768-5-david@redhat.com
Fixes: f1dd2cd13c ("mm, memory_hotplug: do not associate hotadded memory to zones until online") [visible after d0dc12e86b]
Signed-off-by: David Hildenbrand <david@redhat.com>
Reported-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reported-by: Toshiki Fukasawa <t-fukasawa@vx.jp.nec.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Damian Tometzki <damian.tometzki@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Halil Pasic <pasic@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jun Yao <yaojun8558363@gmail.com>
Cc: Logan Gunthorpe <logang@deltatee.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Cc: Pankaj Gupta <pagupta@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qian Cai <cai@lca.pw>
Cc: Rich Felker <dalias@libc.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Yu Zhao <yuzhao@google.com>
Cc: <stable@vger.kernel.org> [4.13+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This reverts commit 56e94ea132.
Commit 56e94ea132 ("fs: ocfs2: fix possible null-pointer dereferences
in ocfs2_xa_prepare_entry()") introduces a regression that fail to
create directory with mount option user_xattr and acl. Actually the
reported NULL pointer dereference case can be correctly handled by
loc->xl_ops->xlo_add_entry(), so revert it.
Link: http://lkml.kernel.org/r/1573624916-83825-1-git-send-email-joseph.qi@linux.alibaba.com
Fixes: 56e94ea132 ("fs: ocfs2: fix possible null-pointer dereferences in ocfs2_xa_prepare_entry()")
Signed-off-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Reported-by: Thomas Voegtle <tv@lio96.de>
Acked-by: Changwei Ge <gechangwei@live.cn>
Cc: Jia-Ju Bai <baijiaju1990@gmail.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The function m_can_runtime_resume() is getting recursively called from
m_can_class_resume(). This results in a lock up.
We need not call m_can_class_resume() during m_can_runtime_resume().
Fixes: f524f829b7 ("can: m_can: Create a m_can platform framework")
Signed-off-by: Pankaj Sharma <pankj.sharma@samsung.com>
Signed-off-by: Sriram Dash <sriram.dash@samsung.com>
Acked-by: Dan Murphy <dmurphy@ti.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
The previous commit added the ability to throttle stations when they queue
too much airtime in the hardware. This commit enables the functionality by
calculating the expected airtime usage of each packet that is dequeued from
the TXQs in mac80211, and accounting that as pending airtime.
The estimated airtime for each skb is stored in the tx_info, so we can
subtract the same amount from the running total when the skb is freed or
recycled. The throttling mechanism relies on this accounting to be
accurate (i.e., that we are not freeing skbs without subtracting any
airtime they were accounted for), so we put the subtraction into
ieee80211_report_used_skb(). As an optimisation, we also subtract the
airtime on regular TX completion, zeroing out the value stored in the
packet afterwards, to avoid having to do an expensive lookup of the station
from the packet data on every packet.
This patch does *not* include any mechanism to wake a throttled TXQ again,
on the assumption that this will happen anyway as a side effect of whatever
freed the skb (most commonly a TX completion).
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20191119060610.76681-5-kyan@google.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
In order for the Fq_CoDel algorithm integrated in mac80211 layer to operate
effectively to control excessive queueing latency, the CoDel algorithm
requires an accurate measure of how long packets stays in the queue, AKA
sojourn time. The sojourn time measured at the mac80211 layer doesn't
include queueing latency in the lower layer (firmware/hardware) and CoDel
expects lower layer to have a short queue. However, most 802.11ac chipsets
offload tasks such TX aggregation to firmware or hardware, thus have a deep
lower layer queue.
Without a mechanism to control the lower layer queue size, packets only
stay in mac80211 layer transiently before being sent to firmware queue.
As a result, the sojourn time measured by CoDel in the mac80211 layer is
almost always lower than the CoDel latency target, hence CoDel does little
to control the latency, even when the lower layer queue causes excessive
latency.
The Byte Queue Limits (BQL) mechanism is commonly used to address the
similar issue with wired network interface. However, this method cannot be
applied directly to the wireless network interface. "Bytes" is not a
suitable measure of queue depth in the wireless network, as the data rate
can vary dramatically from station to station in the same network, from a
few Mbps to over Gbps.
This patch implements an Airtime-based Queue Limit (AQL) to make CoDel work
effectively with wireless drivers that utilized firmware/hardware
offloading. AQL allows each txq to release just enough packets to the lower
layer to form 1-2 large aggregations to keep hardware fully utilized and
retains the rest of the frames in mac80211 layer to be controlled by the
CoDel algorithm.
Signed-off-by: Kan Yan <kyan@google.com>
[ Toke: Keep API to set pending airtime internal, fix nits in commit msg ]
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20191119060610.76681-4-kyan@google.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>