This adds support for the PTP_SYS_OFFSET_EXTENDED ioctl.
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a driver provides gettimex64(), use it in the PTP_SYS_OFFSET ioctl
and POSIX clock's gettime() instead of gettime64(). Drivers should
provide only one of the functions.
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PTP_SYS_OFFSET ioctl, which can be used to measure the offset
between a PHC and the system clock, includes the total time that the
driver needs to read the PHC timestamp.
This typically involves reading of multiple PCI registers (sometimes in
multiple iterations) and the register that contains the lowest bits of
the timestamp is not read in the middle between the two readings of the
system clock. This asymmetry causes the measured offset to have a
significant error.
Introduce a new ioctl, driver function, and helper functions, which
allow the reading of the lowest register to be isolated from the other
readings in order to reduce the asymmetry. The ioctl returns three
timestamps for each measurement:
- system time right before reading the lowest bits of the PHC timestamp
- PHC time
- system time immediately after reading the lowest bits of the PHC
timestamp
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If a gettime64 call fails, return the error and avoid copying data back
to user.
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reorder declarations of variables as reversed Christmas tree.
Cc: Richard Cochran <richardcochran@gmail.com>
Suggested-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Huazhong Tan says:
====================
hns3: add code optimization for VF reset and some new reset feature
Currently hardware supports below reset:
1. VF reset: triggered by sending cmd to IMP(Integrated Management
Processor). Only reset specific VF function and do not affect
other PF or VF.
2. PF reset: triggered by sending cmd to IMP. Only reset specific PF
and it's VF.
3. PF FLR: triggered by PCIe subsystem. Only reset specific PF and
it's VF.
4. VF FLR: triggered by PCIe subsystem. Only reset specific VF function
and do not affect other PF or VF.
5. Core reset: triggered by writing to register. Reset most hardware
unit, such as SSU, which affects all the PF and VF.
6. Global reset: triggered by writing to register. Reset all hardware
unit, which affects all the PF and VF.
7. IMP reset: triggered by IMU(Intelligent Management Unit) when
IMP is not longer feeding IMU's watchdog. IMU will reload the IMP
firmware and IMP will perform global reset after firmware reloading,
which affects all the PF and VF.
Current driver only support PF/VF reset, incomplete core and global
reset(lacking the vf reset handling). So this patchset adds complete
reset support in hns3 driver.
Also, this patchset contains some optimization related to reset.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch implements the .reset_prepare and .reset_done
ops from pci framework to support the VF FLR.
This patch uses hclgevf_set_def_reset_request() and
hclgevf_reset_event() to handle FLR, so when
hdev->default_reset_request is non zero, it means there is
some reset requseted by hclgevf_set_def_reset_request() need
to be processed. Also get the hdev from the ae_dev because
hclgevf_reset_event is called with handle being NULL.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While doing PF FLR, VF's PCIe configuration space will be cleared, so
the pci and vector of VF should be re-initialized in the VF's reset
process while PF doing FLR.
Also, this patch fixes some memory not freed problem when pci
re-initialization is done during reset process.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch implements the .reset_prepare and .reset_done
ops from pci framework to support the PF FLR.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current code only print the prompt message after receiving
the IMP reset interrupt and does not perform the corresponding driver
reset operation. This patch implements the missing IMP reset handling
in the driver.
1. The driver sets the HCLGE_STATE_CMD_DISABLE to stop sending command
after receiving the IMP reset interrupt.
2. The driver needs to notify the hardware to reload the IMP firmware.
3. The IMP firmware reloading makes the reset time of hardware longer,
so it is necessary to extend the driver's waiting time to wait for
the hardware reset to complete.
4. In hclge_check_event_cause, IMP reset event should have higher
priority than other events.
Also, after clearing HCLGE_STATE_CMD_DISABLE in the hclge_cmd_init(),
it needs to check whether there is a pending reset, if so, just set
the HCLGE_STATE_CMD_DISABLE back and return.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When calling napi_disable during reset down process, if NAPIF_STATE_MISSED
is set, napi_complete will call __napi_schedule to do the polling again.
So this patch uses HNS3_NIC_STATE_DOWN to ensure the polling is not
scheduled again.
Also, when napi_complete returns true, it means polling is scheduled
again, it is not neccssary to enable the interrupt.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since hclgevf_reset() may fail for some reasons, so it needs an error
handler to deal with it. When VF reset failed, VF can only be restored
by a higher level reset asserted by PF. So, it needs to reinitialize
its command queue, then it can respond to higher level reset.
Also, this patch adds error logging in the hclgevf_notify_client().
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
According to hardware's description, after the reset occurs, the driver
needs to re-initialize the command queue before sending and receiving
any commands. Therefore, the VF's driver needs to identify the command
queue needs to re-initialize with HCLGEVF_STATE_CMD_DISABLE, and does
not allow sending or receiving commands before the re-initialization.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a Core/Global/IMP reset occurs, the hardware sets the reset status
register of all PF/VF and reports a reset interrupt to all PF/VF and
firmware.
When receiving the reset interrupt:
1. The firmware will wait for 100 ms before resetting the hardware and
clear the reset status register of all PF when hardware reset is done.
2. The PF/VF driver needs to down the netdev within 100 ms and then wait
for hardware reset to finish.
3. After firmware clearing the reset status register of all PF, the PF
driver reinitializes the hardware and clear the reset status register
of it's VF.
4. After PF driver clearing the reset status register of VF, the VF driver
reinitializes the hardware.
This patch mainly add handling for the step 4.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When PF performs a function reset, the hardware will reset both PF
and all the VF belong to this PF. Hence, both PF's driver and VF's
driver need to perform corresponding reset operations.
Before PF driver asserting function reset to hardware, it firstly
set up VF's hardware reset status, and inform the VF driver with
HNAE3_VF_PF_FUNC_RESET, then VF driver sets this reset type to
reset_pending and shechule reset task to stop IO and waits for the
hardware reset status to clear. When PF driver has reinitialized the
hardware and is ready to process mailbox from VF, PF driver clears
VF's hardware reset status for VF to continue its reset process.
Also, this patch uses readl_poll_timeout to simplify the hardware reset
status waitting.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently when VF need to reset itself, it will send a cmd to PF,
after receiving the VF reset requset, PF sends a cmd to inform
VF to enter the reset process and send a cmd to firmware to do the
actual reset for the VF, it is possible that firmware has resetted
the VF, but VF has not entered the reset process, which may cause
IO not stopped problem when firmware is resetting VF.
This patch fixes it by adjusting the VF reset process, when VF
need to reset itself, it will enter the reset process first, and
it will tell the PF to send cmd to firmware to reset itself.
Add member reset_pending to struct hclgevf_dev, which indicates that
there is reset event need to be processed by the VF's reset task, and
the VF's reset task chooses the highest-level one and clears other
low-level one when it processes reset_pending.
hclge_inform_reset_assert_to_vf function is unused now, but it will
be used to support the PF reset with VF working, so declare it in
the header file.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When doing reset, the reset handling function only need to
reinitialize hardware, it makes sense to add a function to
do that job. Also the error handling of hclgevf_init_hdev is
different when it is used in reset process.
This patch adds reset_hdev to reinitialize hardware when resetting.
Also, this patch removes the hclgevf_dev_ongoing_full_reset because
it is unused now.
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The locally maintained list for tracking hash mac table was
not freed during driver remove.
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
mac_hlist was initialized during adapter_up, which will be called
every time a vf device is first brought up, or every time when device
is brought up again after bringing all devices down. This means our
state of previous list is lost, causing a memleak if entries are
present in the list. To fix that, move list init to the condition
that performs initial one time adapter setup.
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The locally maintained list for tracking hash mac table was
not freed during driver remove.
Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
if skb is NULL pointer, and the following access of skb's
skb_mstamp_ns will trigger panic, which is same as BUG_ON
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
RING_PUSH_REQUESTS_AND_CHECK_NOTIFY is already able to make sure backend sees
requests before req_prod is updated.
Signed-off-by: Jacob Wen <jian.w.wen@oracle.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski says:
====================
nfp: abm: move code and improve parameter validation
This set starts by separating Qdisc handling code into a new file.
Next two patches allow early access to TLV-based capabilities during
probe, previously the capabilities were parsed just before netdevs
were registered, but its cleaner to do some basic validation earlier
and avoid cleanup work.
Next three patches improve RED's parameter validation. First we provide
a more precise message about why offload failed (and move the parameter
validation to a helper). Next we make sure we don't set the top bit
in the 32 bit max RED threshold value. Because FW is treating the value
as signed it reportedly causes slow downs (unnecessary queuing and
marking) when top bit is set with recent firmwares. Last (and perhaps
least importantly) we offload the harddrop parameter of the Qdisc.
We don't plan to offload harddrop RED, but it seems prudent to make
sure user didn't set that flag as device behaviour would have differed.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
RED Qdisc will now inform the drivers about the state of the harddrop
flag. Refuse to offload in case harddrop is set.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To mirror software behaviour on offload more precisely inform
the drivers about the state of the harddrop flag.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Turns out the threshold value is used in signed compares in the FW,
so we should avoid setting the top bit.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Improve log messages printed when RED can't be offloaded because
of Qdisc parameters.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In certain cases initialization logic which follows allocation of
the vNIC structure may want to validate the capabilities of that vNIC.
This is easy before vNIC is initialized for normal capabilities which
are at fixed offsets in control memory, easy to locate and read, but
poses a challenge if the capabilities are in form of TLVs. Parse
the TLVs early on so other code can just access parsed info, instead
of having to do the parsing by itself.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move setting ctrl_bar pointer to the nfp_net_alloc function,
to make sure we can parse capabilities early in the following
patch.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The Qdisc offload code is logically separate, and we will soon
do significant surgery on it to support more Qdiscs, so move
it to a separate file.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Recently, in commit ab408b6dc7 ("tcp: switch tcp and sch_fq to new
earliest departure time model"), the TCP BBR code switched to a new
approach of using an explicit bbr_pacing_margin_percent for shaving a
pacing rate "haircut", rather than the previous implict
approach. Update an old comment to reflect the new approach.
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Mirosław says:
====================
net: Use __vlan_hwaccel_*() helpers
This series removes from networking core and driver code an assumption
about how VLAN tag presence is stored in an skb. This will allow to free
up overloading of VLAN.CFI bit to incidate tag's presence.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This removes assumption than vlan_tci != 0 when tag is present.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
This removes assumptions about VLAN_TAG_PRESENT bit.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use __vlan_hwaccel_put_tag() to set vlan tag and proto fields.
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>
__skb_checksum_complete_head() and __skb_checksum_complete()
are both declared in skbuff.h, they fit better in skbuff.c
than datagram.c.
Cc: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The vlan configuration is not restored after interface donw/up sequence
(if dual-emac - both interfaces). Tested on am572x EVM.
Steps to check:
~# ip link add link eth1 name eth1.100 type vlan id 100
~# ifconfig eth0 down
~# ifconfig eth1 down
Try to remove vid and observe warning:
~# ip link del eth1.100
[ 739.526757] net eth1: removing vlanid 100 from vlan filter
[ 739.533322] failed to kill vid 0081/100 for device eth1
This patch fixes it, restoring only vlan ALE entries and all other
unicast/multicast entries are restored by system calling rx_mode ndo.
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
At this moment, mcast addresses are added for real device only
(reserved vlans for dual-emac mode), even if a mcast address was added
for some vlan only, thus ALE doesn't have corresponding vlan mcast
entries after vlan socket joined multicast group. So ALE drops vlan
frames with mcast addresses intended for vlans and potentially can
receive mcast frames for base ndev. That's not correct. So, fix it by
creating only vlan/mcast entries as requested. Patch doesn't use any
additional lists and is based on device mc address list and cpsw ALE
table entries.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
It's redundancy for the drivers to hold the list of vlans when
absolutely the same list exists in vlan core. In most cases it's
needed only to traverse the vlan devices, their vids and sync some
settings with h/w, so add API to simplify this.
At least some of these drivers also can benefit:
grep "for_each.*vid" -r drivers/net/ethernet/
drivers/net/ethernet/hisilicon/hns3/hns3_enet.c:
drivers/net/ethernet/synopsys/dwc-xlgmac-hw.c:
drivers/net/ethernet/qlogic/qlge/qlge_main.c:
drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c:
drivers/net/ethernet/via/via-rhine.c:
drivers/net/ethernet/via/via-velocity.c:
drivers/net/ethernet/intel/igb/igb_main.c:
drivers/net/ethernet/intel/ice/ice_main.c:
drivers/net/ethernet/intel/e1000/e1000_main.c:
drivers/net/ethernet/intel/i40e/i40e_main.c:
drivers/net/ethernet/intel/e1000e/netdev.c:
drivers/net/ethernet/intel/igbvf/netdev.c:
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c:
drivers/net/ethernet/intel/ixgb/ixgb_main.c:
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:
drivers/net/ethernet/amd/xgbe/xgbe-dev.c:
drivers/net/ethernet/emulex/benet/be_main.c:
drivers/net/ethernet/neterion/vxge/vxge-main.c:
drivers/net/ethernet/adaptec/starfire.c:
drivers/net/ethernet/brocade/bna/bnad.c:
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
In order to avoid all table update, and only remove or add new
address, the auxiliary function exists, named __hw_addr_sync_dev().
It allows end driver do nothing when nothing changed and add/rm when
concrete address is firstly added or lastly removed. But it doesn't
include cases when an address of real device or vlan was reused by
other vlans or vlan/macval devices.
For handaling events when address was reused/unreused the patch adds
new auxiliary routine - __hw_addr_ref_sync_dev(). It allows to do
nothing when nothing was changed and do updates only for an address
being added/reused/deleted/unreused. Thus, clone address changes for
vlans can be mirrored in the table. The function is exclusive with
__hw_addr_sync_dev(). It's responsibility of the end driver to
identify address vlan device, if it needs so.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
As added in 3e59020abf ("net: bql: add __netdev_tx_sent_queue()"), which
see for performance rationale.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michał Mirosław says:
====================
net: Remove VLAN_TAG_PRESENT from drivers
This series removes VLAN_TAG_PRESENT use from network drivers in
preparation to removing its special meaning.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Signed-off-by: David S. Miller <davem@davemloft.net>