Add watchdog for rt2800 devices. For now it only detect hung
and print error.
[Note: I verified that printing messages from process context is
fine on MT7620 (WT3020) platform that have problem when printk
is called from interrupt context].
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
For mmio we do not properlly trace dma done Q_INDEX_DMA_DONE index
for TX queues. That would require implementing INT_SOURCE_CSR_*_DMA_DONE
interrupts, what is rather not worth to do due to adding extra
CPU load (small but still somewhat not necessary otherwise).
We can just read TX DMA done indexes from registers directly. What
will be used by watchdog.
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Allow subdriver to change watchdog interval by intialize
link->watchdog_interval value before rt2x00link_register().
Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
This patch removes the error branch for (queue > dev->queues).
It is no longer needed anymore as the "queue" value is validated by
cfg80211's parse_txq_params() before the driver code gets called.
Signed-off-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
rtw_writ16_set should be rtw_write16_set
Signed-off-by: Tzu-En Huang <tehuang@realtek.com>
Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
The hardware default uses management queue to transmit frames that are
downloaded into reserved page, so we need to clearly assign the frames
to use qsel in TX_DESC_QSEL_MGMT to avoid using wrong queue.
Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
DACK is done right after the hardware has been turned on, which
means it will be done every time we leave the IDLE state.
But it takes ~2 seconds to finish DACK.
We can back up the results and restore them. And it only takes a few
milliseconds to restore the results to the hardware, saving a lot of
time.
Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
We could fail to power on because it was already on. If the return
value is -EALREADY, power off and then power on again to turn on the
hardware as expected.
Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
8822c used to count OFDM FA count by subtracting tx count from FA count.
But it need to substract more counters to be accurate.
However, we can count it by adding up all of the FA counters we want.
And it is simpler to add than list all of the components to substract.
Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Driver Could fail to reset counter if rx clock gating is not disabled.
So we need to disable rx clock gating before resetting counters.
Otherwise counters may increase unexpected.
Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
In 2G channels, the cck source and rxagc should be set to different
values based on different bandwidth to increase the performance of rx
sensitivity.
To improve rx throughput performance, the values of sbd subtune and
pt_opt should be changed in different bandwidth.
Signed-off-by: Chien-Hsun Liao <ben.liao@realtek.com>
Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Collision of writing rf registers could occur if the driver writes
rf registers by direct write while the hardware is writing other rf
registers by pi write simultaneously.
Hardware pi write can be triggered by rf calibrations sometimes, so
the driver can not always write rf registers by direct write
protection. Direct write protection can make sure that there is no
hardware pi write during the direct write.
According to some experiments, if we add direct write protection
when switching channel, the performance of rf calibration will not
be affected.
Signed-off-by: Chien-Hsun Liao <ben.liao@realtek.com>
Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Add beacon function setting routines for each hardware port.
If beacon function is not enabled, the hardware is not able
to synchronize with AP's beacon and can miss the beacons
under some scenarios such as PS mode.
For AP and Adhoc modes that require to send beacons, do not
update the TSF, otherwise the beacon interval may be affected.
Signed-off-by: Chin-Yen Lee <timlee@realtek.com>
Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
When driver uses random mac address to scan, the unicast probe response
will not be received because the addr1 is not matched. Configure port
address by requested mac address to receive probe response from AP.
To support random mac scan, we need to configure the mac address during
scan period to receive unicast prop_resp. After scan is completed,
configure the mac address back to the original one that the port used.
Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
With dynamic power save support, rtw88 is able to support fast tx
path, claim it to mac80211.
Signed-off-by: Yan-Hsuan Chuang <yhchuang@realtek.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
When calling debugfs functions, there is no need to ever check the
return value. This driver was saving the debugfs file away to be
removed at a later time. However, the 80211 core would delete the whole
directory that the debugfs files are created in, after it asks the
driver to do the deletion, so just rely on the 80211 core to do all of
the cleanup for us, making us not need to keep a pointer to the dentries
around at all.
This cleans up the structure of the driver data a bit and makes the code
a tiny bit smaller.
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
When calling debugfs functions, there is no need to ever check the
return value. This driver was saving the debugfs file away to be
removed at a later time. However, the 80211 core would delete the whole
directory that the debugfs files are created in, after it asks the
driver to do the deletion, so just rely on the 80211 core to do all of
the cleanup for us, making us not need to keep a pointer to the dentries
around at all.
This cleans up the structure of the driver data a bit and makes the code
a tiny bit smaller.
Cc: Stanislaw Gruszka <sgruszka@redhat.com>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
The WARN_ON triggers on older BCM4401-B0 100Base-TX ethernet controllers.
The warning serves no purpose. So let's just remove it.
Reported-by: H Buus <ubuntu@hbuus.com>
Signed-off-by: Michael Büsch <m@bues.ch>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Variable rtstatus is being initialized with a value that is never read
as rtstatus is being re-assigned a little later on. The assignment is
redundant and hence can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array. For example:
struct ieee80211_regdomain {
...
struct ieee80211_reg_rule reg_rules[];
};
instance = kzalloc(sizeof(*mac->rd) +
sizeof(struct ieee80211_reg_rule) *
count, GFP_KERNEL);
Instead of leaving these open-coded and prone to type mistakes, we can
now use the new struct_size() helper:
instance = kzalloc(struct_size(instance, reg_rules, count), GFP_KERNEL);
This code was detected with the help of Coccinelle.
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: Sergey Matyukevich <sergey.matyukevich.os@quantenna.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
The assignment of 0 to variable k is never read once we break out of
the loop, so the assignment is redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
The variable badworden is assigned with a value that is never read and
it is re-assigned a new value immediately afterwards. The assignment is
redundant and can be removed.
Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Larry Finger <Larry.Finger@lwfinger.net>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Firmware can send invalid command response, the processing of
which can attempt to modify unexpected context and cause issues.
To fix this, driver should check that the command response ID is
same as the one it downloaded, and ignore processing of invalid
response.
Signed-off-by: Swati Kushwaha <swatiuma@marvell.com>
Signed-off-by: Ganapathi Bhat <gbhat@marvell.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
In set_mac_address, driver check for interfaces with same bss_type
For first STA entry, this would return 3 interfaces since all priv's have
bss_type as 0 due to kzalloc. Thus mac address gets changed for STA
unexpected. This patch adds check for first STA and avoids mac address
change. This patch also adds mac_address change for p2p based on bss_num
type.
Signed-off-by: Sharvari Harisangam <sharvari@marvell.com>
Signed-off-by: Ganapathi Bhat <gbhat@marvell.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Unadorned '%p' has restrictive policies these days, such that it usually
just prints garbage at early boot (see
Documentation/core-api/printk-formats.rst, "kernel will print
``(ptrval)`` until it gathers enough entropy"). Annotating with %pK
(for "kernel pointer") allows the kptr_restrict sysctl to control
printing policy better.
We might just as well drop this message entirely, but this fix was easy
enough for now.
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
This message is pointless.
While we're at it, include the error code in the error message, which is
not pointless.
Signed-off-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
The syzbot fuzzer found a bug in the p54 USB wireless driver. The
issue involves a race between disconnect and the firmware-loader
callback routine, and it has several aspects.
One big problem is that when the firmware can't be loaded, the
callback routine tries to unbind the driver from the USB _device_ (by
calling device_release_driver) instead of from the USB _interface_ to
which it is actually bound (by calling usb_driver_release_interface).
The race involves access to the private data structure. The driver's
disconnect handler waits for a completion that is signalled by the
firmware-loader callback routine. As soon as the completion is
signalled, you have to assume that the private data structure may have
been deallocated by the disconnect handler -- even if the firmware was
loaded without errors. However, the callback routine does access the
private data several times after that point.
Another problem is that, in order to ensure that the USB device
structure hasn't been freed when the callback routine runs, the driver
takes a reference to it. This isn't good enough any more, because now
that the callback routine calls usb_driver_release_interface, it has
to ensure that the interface structure hasn't been freed.
Finally, the driver takes an unnecessary reference to the USB device
structure in the probe function and drops the reference in the
disconnect handler. This extra reference doesn't accomplish anything,
because the USB core already guarantees that a device structure won't
be deallocated while a driver is still bound to any of its interfaces.
To fix these problems, this patch makes the following changes:
Call usb_driver_release_interface() rather than
device_release_driver().
Don't signal the completion until after the important
information has been copied out of the private data structure,
and don't refer to the private data at all thereafter.
Lock udev (the interface's parent) before unbinding the driver
instead of locking udev->parent.
During the firmware loading process, take a reference to the
USB interface instead of the USB device.
Don't take an unnecessary reference to the device during probe
(and then don't drop it during disconnect).
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Reported-and-tested-by: syzbot+200d4bb11b23d929335f@syzkaller.appspotmail.com
CC: <stable@vger.kernel.org>
Acked-by: Christian Lamparter <chunkeey@gmail.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Resolve conflict between d2912cb15b ("treewide: Replace GPLv2
boilerplate/reference with SPDX - rule 500") removing the GPL disclaimer
and fe03d47456 ("Update my email address") which updates Jozsef
Kadlecsik's email.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Raju Rangoju says:
====================
cxgb4: Reference count MPS TCAM entries within a PF
Firmware reference counts the MPS TCAM entries by PF and VF,
but it does not do it for usage within a PF or VF. This patch
adds the support to track MPS TCAM entries within a PF.
v2->v3:
Fixed the compiler errors due to incorrect patch
Also, removed the new blank line at EOF
v1->v2:
Use refcount_t type instead of atomic_t for mps reference count
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds reference counting support for
alloc/free mac filters
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds TCAM reference counting
support for cxgb4 change mac path
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds TCAM reference counting
support for raw mac filters.
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove existing mps refcounting code which was
added only for encap filters and add necessary
data structures/functions to support mps reference
counting for all the mac filters. Also add wrapper
functions for allocating and freeing encap mac
filters.
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PHY selection bit also exists on SoCs without an internal PHY; if it's
set to 1 (internal PHY, default value) then the MAC will not make use of
any PHY on such SoCs.
This problem appears when adapting for H6, which has no real internal PHY
(the "internal PHY" on H6 is not on-die, but on a co-packaged AC200 chip,
connected via RMII interface at GPIO bank A).
Force the PHY selection bit to 0 when the SOC doesn't have an internal PHY,
to address the problem of a wrong default value.
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Ondrej Jirman <megous@megous.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The EMAC on Allwinner H6 is just like the one on A64. The "internal PHY" on
H6 is on a co-packaged AC200 chip, and it's not really internal (it's
connected via RMII at PA GPIO bank).
Add support for the Allwinner H6 EMAC in the dwmac-sun8i driver.
Signed-off-by: Icenowy Zheng <icenowy@aosc.io>
Signed-off-by: Ondrej Jirman <megous@megous.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Stefano Brivio says:
====================
Fix listing (IPv4, IPv6) and flushing (IPv6) of cached route exceptions
For IPv6 cached routes, the commands 'ip -6 route list cache' and
'ip -6 route flush cache' don't work at all after route exceptions have
been moved to a separate hash table in commit 2b760fcf5c ("ipv6: hook
up exception table to store dst cache").
For IPv4 cached routes, the command 'ip route list cache' has also
stopped working in kernel 3.5 after commit 4895c771c7 ("ipv4: Add FIB
nexthop exceptions.") introduced storage for route exceptions as a
separate entity.
Fix this by allowing userspace to clearly request cached routes with
the RTM_F_CLONED flag used as a filter (in conjuction with strict
checking) and by retrieving and dumping cached routes if requested.
If strict checking is not requested (iproute2 < 5.0.0), we don't have a
way to consistently filter results on other selectors (e.g. on tables),
so skip filtering entirely and dump both regular routes and exceptions.
For IPv4, cache flushing uses a completely different mechanism, so it
wasn't affected. Listing of exception routes (modified routes pre-3.5) was
tested against these versions of kernel and iproute2:
iproute2
kernel 4.14.0 4.15.0 4.19.0 5.0.0 5.1.0
3.5-rc4 + + + + +
4.4
4.9
4.14
4.15
4.19
5.0
5.1
fixed + + + + +
For IPv6, a separate iproute2 patch is required. Versions of iproute2
and kernel tested:
iproute2
kernel 4.14.0 4.15.0 4.19.0 5.0.0 5.1.0 5.1.0, patched
3.18 list + + + + + +
flush + + + + + +
4.4 list + + + + + +
flush + + + + + +
4.9 list + + + + + +
flush + + + + + +
4.14 list + + + + + +
flush + + + + + +
4.15 list
flush
4.19 list
flush
5.0 list
flush
5.1 list
flush
with list + + + + + +
fix flush + + + +
v7: Make sure r->rtm_tos is initialised in 3/11, move loop over nexthop
objects in 4/11, add comments about usage of "skip" counters in commit
messages of 4/11 and 8/11
v6: Target for net-next, rebase and adapt to nexthop objects for IPv6 paths.
Merge selftests into this series (as they were addressed for net-next).
A number of minor changes detailed in logs of single patches.
v5: Skip filtering altogether if no strict checking is requested: selecting
routes or exceptions only would be inconsistent with the fact we can't
filter on tables. Drop 1/8 (non-strict dump filter function no longer
needed), replace 2/8 (don't use NLM_F_MATCH, decide to skip routes or
exceptions in filter function), drop 6/8 (2/8 is enough for IPv6 too).
Introduce dump_routes and dump_exceptions flags in filter, adapt other
patches to that.
v4: Fix the listing issue also for IPv4, making the behaviour consistent
with IPv6. Honour NLM_F_MATCH as per RFC 3549 and allow usage of
RTM_F_CLONED filter. Split patches into smaller logical changes.
v3: Drop check on RTM_F_CLONED and rework logic of return values of
rt6_dump_route()
v2: Add count of routes handled in partial dumps, and skip them, in patch 1/2.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of just listing and flushing two cached exceptions, create
a relatively big number of them, and count how many are listed. Single
netlink dump messages contain approximately 25 entries each, and this
way we can make sure the partial dump tracking mechanism is working
properly.
While at it, also ensure that no cached routes can be listed after
flush, and remove 'sleep 1' calls, they are not actually needed.
v7: No changes
v6:
- Merge this patch into series including fix, as it's also targeted
for net-next. No actual changes
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This test checks that route exceptions can be successfully listed and
flushed using ip -6 route {list,flush} cache.
v7: No changes
v6:
- Merge this patch into series including fix, as it's also targeted
for net-next
- Drop left-over print of 'ip route list cache | wc -l'
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we perform an inexact match on FIB nodes via fib6_locate_1(), longer
prefixes will be preferred to shorter ones. However, it might happen that
a node, with higher fn_bit value than some other, has no valid routing
information.
In this case, we'll pick that node, but it will be discarded by the check
on RTN_RTINFO in fib6_locate(), and we might miss nodes with valid routing
information but with lower fn_bit value.
This is apparent when a routing exception is created for a default route:
# ip -6 route list
fc00:1::/64 dev veth_A-R1 proto kernel metric 256 pref medium
fc00:2::/64 dev veth_A-R2 proto kernel metric 256 pref medium
fc00:4::1 via fc00:2::2 dev veth_A-R2 metric 1024 pref medium
fe80::/64 dev veth_A-R1 proto kernel metric 256 pref medium
fe80::/64 dev veth_A-R2 proto kernel metric 256 pref medium
default via fc00:1::2 dev veth_A-R1 metric 1024 pref medium
# ip -6 route list cache
fc00:4::1 via fc00:2::2 dev veth_A-R2 metric 1024 expires 593sec mtu 1500 pref medium
fc00:3::1 via fc00:1::2 dev veth_A-R1 metric 1024 expires 593sec mtu 1500 pref medium
# ip -6 route flush cache # node for default route is discarded
Failed to send flush request: No such process
# ip -6 route list cache
fc00:3::1 via fc00:1::2 dev veth_A-R1 metric 1024 expires 586sec mtu 1500 pref medium
Check right away if the node has a RTN_RTINFO flag, before replacing the
'prev' pointer, that indicates the longest matching prefix found so far.
Fixes: 38fbeeeecc ("ipv6: prepare fib6_locate() for exception table")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 2b760fcf5c ("ipv6: hook up exception table to store dst
cache"), route exceptions reside in a separate hash table, and won't be
found by walking the FIB, so they won't be dumped to userspace on a
RTM_GETROUTE message.
This causes 'ip -6 route list cache' and 'ip -6 route flush cache' to
have no function anymore:
# ip -6 route get fc00:3::1
fc00:3::1 via fc00:1::2 dev veth_A-R1 src fc00:1::1 metric 1024 expires 539sec mtu 1400 pref medium
# ip -6 route get fc00:4::1
fc00:4::1 via fc00:2::2 dev veth_A-R2 src fc00:2::1 metric 1024 expires 536sec mtu 1500 pref medium
# ip -6 route list cache
# ip -6 route flush cache
# ip -6 route get fc00:3::1
fc00:3::1 via fc00:1::2 dev veth_A-R1 src fc00:1::1 metric 1024 expires 520sec mtu 1400 pref medium
# ip -6 route get fc00:4::1
fc00:4::1 via fc00:2::2 dev veth_A-R2 src fc00:2::1 metric 1024 expires 519sec mtu 1500 pref medium
because iproute2 lists cached routes using RTM_GETROUTE, and flushes them
by listing all the routes, and deleting them with RTM_DELROUTE one by one.
If cached routes are requested using the RTM_F_CLONED flag together with
strict checking, or if no strict checking is requested (and hence we can't
consistently apply filters), look up exceptions in the hash table
associated with the current fib6_info in rt6_dump_route(), and, if present
and not expired, add them to the dump.
We might be unable to dump all the entries for a given node in a single
message, so keep track of how many entries were handled for the current
node in fib6_walker, and skip that amount in case we start from the same
partially dumped node.
When a partial dump restarts, as the starting node might change when
'sernum' changes, we have no guarantee that we need to skip the same
amount of in-node entries. Therefore, we need two counters, and we need to
zero the in-node counter if the node from which the dump is resumed
differs.
Note that, with the current version of iproute2, this only fixes the
'ip -6 route list cache': on a flush command, iproute2 doesn't pass
RTM_F_CLONED and, due to this inconsistency, 'ip -6 route flush cache' is
still unable to fetch the routes to be flushed. This will be addressed in
a patch for iproute2.
To flush cached routes, a procfs entry could be introduced instead: that's
how it works for IPv4. We already have a rt6_flush_exception() function
ready to be wired to it. However, this would not solve the issue for
listing.
Versions of iproute2 and kernel tested:
iproute2
kernel 4.14.0 4.15.0 4.19.0 5.0.0 5.1.0 5.1.0, patched
3.18 list + + + + + +
flush + + + + + +
4.4 list + + + + + +
flush + + + + + +
4.9 list + + + + + +
flush + + + + + +
4.14 list + + + + + +
flush + + + + + +
4.15 list
flush
4.19 list
flush
5.0 list
flush
5.1 list
flush
with list + + + + + +
fix flush + + + +
v7:
- Explain usage of "skip" counters in commit message (suggested by
David Ahern)
v6:
- Rebase onto net-next, use recently introduced nexthop walker
- Make rt6_nh_dump_exceptions() a separate function (suggested by David
Ahern)
v5:
- Use dump_routes and dump_exceptions from filter, ignore NLM_F_MATCH,
update test results (flushing works with iproute2 < 5.0.0 now)
v4:
- Split NLM_F_MATCH and strict check handling in separate patches
- Filter routes using RTM_F_CLONED: if it's not set, only return
non-cached routes, and if it's set, only return cached routes:
change requested by David Ahern and Martin Lau. This implies that
iproute2 needs a separate patch to be able to flush IPv6 cached
routes. This is not ideal because we can't fix the breakage caused
by 2b760fcf5c entirely in kernel. However, two years have passed
since then, and this makes it more tolerable
v3:
- More descriptive comment about expired exceptions in rt6_dump_route()
- Swap return values of rt6_dump_route() (suggested by Martin Lau)
- Don't zero skip_in_node in case we don't dump anything in a given pass
(also suggested by Martin Lau)
- Remove check on RTM_F_CLONED altogether: in the current UAPI semantic,
it's just a flag to indicate the route was cloned, not to filter on
routes
v2: Add tracking of number of entries to be skipped in current node after
a partial dump. As we restart from the same node, if not all the
exceptions for a given node fit in a single message, the dump will
not terminate, as suggested by Martin Lau. This is a concrete
possibility, setting up a big number of exceptions for the same route
actually causes the issue, suggested by David Ahern.
Reported-by: Jianlin Shi <jishi@redhat.com>
Fixes: 2b760fcf5c ("ipv6: hook up exception table to store dst cache")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the next patch, we are going to add optional dump of exceptions to
rt6_dump_route().
Change the return code of rt6_dump_route() to accomodate partial node
dumps: we might dump multiple routes per node, and might be able to dump
only a given number of them, so fib6_dump_node() will need to know how
many routes have been dumped on partial dump, to restart the dump from the
point where it was interrupted.
Note that fib6_dump_node() is the only caller and already handles all
non-negative return codes as success: those become -1 to signal that we're
done with the node. If we fail, return 0, as we were unable to dump the
single route in the node, but we're not done with it.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If fc_nh_id isn't set, we shouldn't try to match against it. This
actually matters just for the RTF_CACHE below (where this case is
already handled): if iproute2 gets a route exception and tries to
delete it, it won't reference it by fc_nh_id, even if a nexthop
object might be associated to the originating route.
Fixes: 5b98324ebe ("ipv6: Allow routes to use nexthop objects")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 08e814c9e8: as we
are preparing to fix listing and dumping of IPv6 cached routes, we
need to allow RTM_F_CLONED as a flag to match routes against while
dumping them.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 4895c771c7 ("ipv4: Add FIB nexthop exceptions."), cached
exception routes are stored as a separate entity, so they are not dumped
on a FIB dump, even if the RTM_F_CLONED flag is passed.
This implies that the command 'ip route list cache' doesn't return any
result anymore.
If the RTM_F_CLONED is passed, and strict checking requested, retrieve
nexthop exception routes and dump them. If no strict checking is
requested, filtering can't be performed consistently: dump everything in
that case.
With this, we need to add an argument to the netlink callback in order to
track how many entries were already dumped for the last leaf included in
a partial netlink dump.
A single additional argument is sufficient, even if we traverse logically
nested structures (nexthop objects, hash table buckets, bucket chains): it
doesn't matter if we stop in the middle of any of those, because they are
always traversed the same way. As an example, s_i values in [], s_fa
values in ():
node (fa) #1 [1]
nexthop #1
bucket #1 -> #0 in chain (1)
bucket #2 -> #0 in chain (2) -> #1 in chain (3) -> #2 in chain (4)
bucket #3 -> #0 in chain (5) -> #1 in chain (6)
nexthop #2
bucket #1 -> #0 in chain (7) -> #1 in chain (8)
bucket #2 -> #0 in chain (9)
--
node (fa) #2 [2]
nexthop #1
bucket #1 -> #0 in chain (1) -> #1 in chain (2)
bucket #2 -> #0 in chain (3)
it doesn't matter if we stop at (3), (4), (7) for "node #1", or at (2)
for "node #2": walking flattens all that.
It would even be possible to drop the distinction between the in-tree
(s_i) and in-node (s_fa) counter, but a further improvement might
advise against this. This is only as accurate as the existing tracking
mechanism for leaves: if a partial dump is restarted after exceptions
are removed or expired, we might skip some non-dumped entries.
To improve this, we could attach a 'sernum' attribute (similar to the
one used for IPv6) to nexthop entities, and bump this counter whenever
exceptions change: having a distinction between the two counters would
make this more convenient.
Listing of exception routes (modified routes pre-3.5) was tested against
these versions of kernel and iproute2:
iproute2
kernel 4.14.0 4.15.0 4.19.0 5.0.0 5.1.0
3.5-rc4 + + + + +
4.4
4.9
4.14
4.15
4.19
5.0
5.1
fixed + + + + +
v7:
- Move loop over nexthop objects to route.c, and pass struct fib_info
and table ID to it, not a struct fib_alias (suggested by David Ahern)
- While at it, note that the NULL check on fa->fa_info is redundant,
and the check on RTNH_F_DEAD is also not consistent with what's done
with regular route listing: just keep it for nhc_flags
- Rename entry point function for dumping exceptions to
fib_dump_info_fnhe(), and rearrange arguments for consistency with
fib_dump_info()
- Rename fnhe_dump_buckets() to fnhe_dump_bucket() and make it handle
one bucket at a time
- Expand commit message to describe why we can have a single "skip"
counter for all exceptions stored in bucket chains in nexthop objects
(suggested by David Ahern)
v6:
- Rebased onto net-next
- Loop over nexthop paths too. Move loop over fnhe buckets to route.c,
avoids need to export rt_fill_info() and to touch exceptions from
fib_trie.c. Pass NULL as flow to rt_fill_info(), it now allows that
(suggested by David Ahern)
Fixes: 4895c771c7 ("ipv4: Add FIB nexthop exceptions.")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In the next patch, we're going to use rt_fill_info() to dump exception
routes upon RTM_GETROUTE with NLM_F_ROOT, meaning userspace is requesting
a dump and not a specific route selection, which in turn implies the input
interface is not relevant. Update rt_fill_info() to handle a NULL
flowinfo.
v7: If fl4 is NULL, explicitly set r->rtm_tos to 0: it's not initialised
otherwise (spotted by David Ahern)
v6: New patch
Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This functionally reverts the check introduced by commit
e8ba330ac0 ("rtnetlink: Update fib dumps for strict data checking")
as modified by commit e4e92fb160 ("net/ipv4: Bail early if user only
wants prefix entries").
As we are preparing to fix listing of IPv4 cached routes, we need to
give userspace a way to request them.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The following patches add back the ability to dump IPv4 and IPv6 exception
routes, and we need to allow selection of regular routes or exceptions.
Use RTM_F_CLONED as filter to decide whether to dump routes or exceptions:
iproute2 passes it in dump requests (except for IPv6 cache flush requests,
this will be fixed in iproute2) and this used to work as long as
exceptions were stored directly in the FIB, for both IPv4 and IPv6.
Caveat: if strict checking is not requested (that is, if the dump request
doesn't go through ip_valid_fib_dump_req()), we can't filter on protocol,
tables or route types.
In this case, filtering on RTM_F_CLONED would be inconsistent: we would
fix 'ip route list cache' by returning exception routes and at the same
time introduce another bug in case another selector is present, e.g. on
'ip route list cache table main' we would return all exception routes,
without filtering on tables.
Keep this consistent by applying no filters at all, and dumping both
routes and exceptions, if strict checking is not requested. iproute2
currently filters results anyway, and no unwanted results will be
presented to the user. The kernel will just dump more data than needed.
v7: No changes
v6: Rebase onto net-next, no changes
v5: New patch: add dump_routes and dump_exceptions flags in filter and
simply clear the unwanted one if strict checking is enabled, don't
ignore NLM_F_MATCH and don't set filter_set if NLM_F_MATCH is set.
Skip filtering altogether if no strict checking is requested:
selecting routes or exceptions only would be inconsistent with the
fact we can't filter on tables.
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>