linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2025-02-25 05:31:13 +07:00

Author	SHA1	Message	Date
Lorenzo Bianconi	3d687a7fcb	mt76: mt7615: add mt7615_mac_wtbl_addr routine Introduce mt7615_mac_wtbl_addr routine to compute sta wtbl address. This is a preliminary patch to update wtbl key directly from host processor Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:30 +02:00
Lorenzo Bianconi	92671eb95c	mt76: mt7615: move mt7615_mac_get_key_info in mac.c This is a preliminary patch to update wtbl key directly from host processor Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:30 +02:00
Felix Fietkau	880495e2f0	mt76: mt7615: add missing register initialization - initialize CCA signal source - initialize clock for band 1 (7615D) - initialize BAR rate Reviewed-by: Ryder Lee <ryder.lee@mediatek.com> Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:30 +02:00
Lorenzo Bianconi	27c7bfc5f0	mt76: mt76x0u: add support to TP-Link T2UHP Introduce support to TP-Link T2UHP https://wikidevi.com/wiki/TP-LINK_Archer_T2UHP Tested-by: Sid Hayn <sidhayn@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:29 +02:00
Stanislaw Gruszka	3d1e5cddae	mt76: mt7615: use params->ssn value directly There is no point to use pointer to params->ssn. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:29 +02:00
Stanislaw Gruszka	f8f3b20a9a	mt76: mt7603: use params->ssn value directly There is no point to use pointer to params->ssn. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:29 +02:00
Stanislaw Gruszka	5eedd2a5c9	mt76: mt76x02: use params->ssn value directly There is no point to use pointer to params->ssn. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:29 +02:00
Stanislaw Gruszka	8f72e98e9c	mt76: usb: remove unneeded {put,get}_unaligned The compiler gives us guarantees on variable alignment, so use a variable as buffer when reading/writing registers and remove unneeded {put,get}_unaligned. Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:29 +02:00
Stanislaw Gruszka	b229bf7d30	mt76: usb: fix endian in mt76u_copy In contrast to mt76_wr() which we use to program registers, on mt76_wr_copy() we should not change endian of the data. Fixes: `b40b15e152` ("mt76: add usb support to mt76 layer") Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com> Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:29 +02:00
Felix Fietkau	820e4da174	mt76: mt7603: fix invalid fallback rates Only decrement the rate index on duplicate rates if it is not already 0 Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:29 +02:00
Felix Fietkau	f4635f66da	mt76: mt7615: fix invalid fallback rates Only decrement the rate index on duplicate rates if it is not already 0 Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:29 +02:00
Felix Fietkau	1f5581dffe	mt76: mt7615: fix PS buffering of action frames Bufferable management frames need to be put in the data queue, otherwise they will not be buffered when the receiver is asleep. Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:29 +02:00
Felix Fietkau	3eb514dd45	mt76: mt7615: fix using VHT STBC rates The hardware expects MT_TX_RATE_NSS to be filled with the number of space/time streams. For non-STBC rates, this is equal to nss. For 1-stream STBC, this needs to be set to 2. This is relevant for VHT rates only, on HT, the value is derived from MCS internally. Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:29 +02:00
Lorenzo Bianconi	cf21105198	mt76: mt76u: fix typo in mt76u_fill_rx_sg Fix typo setting urb->transfer_buffer_length in mt76u_fill_rx_sg Fixes: `b40b15e152` ("mt76: add usb support to mt76 layer") Fixes: `f8f527b16d` ("mt76: usb: use EP max packet aligned buffer sizes for rx") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:29 +02:00
Felix Fietkau	4af81f02b4	mt76: mt7615: sync with mt7603 rate control changes - Store the previous and current rate set in the driver + the TSF value at the time of the switch. - Use the tx status TSF value to determine which rate set needs to be used as reference. - Report only short or long GI rates for a single status event, not a mix. - The hardware reports the last used rate index. Use it along with the retry count to figure out what rate was used for the first attempt. - Use the same retry count value for all rate slots to make this calculation work. - Derive the probe rate from the current rateset instead of the skb cb - Do not wait for a status report for the probe frame before removing the probe rate from the rate table. Do it immediately after it was referenced in a tx status report. - Use the first half of the first rate retry budget for the probe rate in order to avoid using too many retries on that rate - Switch from lower rates to higher rates more conservatively - enable hardware rate up/down selection Reviewed-by: Ryder Lee <ryder.lee@mediatek.com> Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:29 +02:00
Felix Fietkau	5f3413fc5e	mt76: mt7615: reset rate index/counters on rate table update These values must be initialized to zero, otherwise the hardware could reuse previous values, especially the rate index Reviewed-by: Ryder Lee <ryder.lee@mediatek.com> Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:28 +02:00
Felix Fietkau	592ed85d6b	mt76: mt7615: move mt7615_mcu_set_rates to mac.c It bypasses the MCU, so it does not belong in mcu.c Also make mt7615_mac_tx_rate_val static Reviewed-by: Ryder Lee <ryder.lee@mediatek.com> Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:28 +02:00
Felix Fietkau	3815ab3f49	mt76: mt7603: enable hardware rate up/down selection Improves performance by switching away from bad rates faster Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:28 +02:00
Lorenzo Bianconi	6c6a3fe6f9	mt76: mt7615: introduce mt7615_mcu_send_ram_firmware routine Add mt7615_mcu_send_ram_firmware routine since mt7615_load_ram runs the same code to send ram firmware to cr4 and n9 mcus. Moreover rename gen_dl_mode in mt7615_mcu_gen_dl_mode. This patch does not introduce any behaviour change, it is just code refactor. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:28 +02:00
Lorenzo Bianconi	2fc446487c	mt76: mt7615: always release sem in mt7615_load_patch Release patch semaphore even if request_firmware fails in mt7615_load_patch Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:28 +02:00
Lorenzo Bianconi	4a926e3022	mt76: mt7615: fall back to sw encryption for unsupported ciphers Fix following warning falling back to sw encryption for unsupported ciphers WARNING: CPU: 2 PID: 1495 at backports-4.19.32-1/net/mac80211/key.c:1023 mt76_wcid_key_setup+0x68/0xbc [mt76] CPU: 2 PID: 1495 Comm: hostapd Not tainted 4.14.131 #0 Stack : 00000000 8f0f8bc0 00000000 8007ccec 805f0000 8058ec18 00000000 00000000 80559788 8dca79bc 8fefb10c 805c89c7 805545c8 00000001 8dca7960 53261662 00000000 00000000 80640000 00004668 00000000 000000e9 00000007 00000000 00000000 805d0000 00072537 00000000 80000000 00000000 805f0000 8f1e70d0 8e8fa098 000003ff 805c0000 8f0f8bc0 00000001 802d4340 00000008 80630008 [<800108d0>] show_stack+0x58/0x100 [<8049214c>] dump_stack+0x9c/0xe0 [<80033998>] __warn+0xe0/0x138 [<80033a80>] warn_slowpath_null+0x1c/0x2c [<8e8fa098>] mt76_wcid_key_setup+0x68/0xbc [mt76] [<8e889930>] mt7615_eeprom_init+0x7c0/0xe14 [mt7615e] Suggested-by: Sebastian Gottschall <s.gottschall@newmedia-net.de> Signed-off-by: Ryder Lee <ryder.lee@mediatek.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:28 +02:00
Felix Fietkau	5abe8baf10	mt76: mt7615: clean up FWDL TXQ during/after firmware upload Since we don't clean that tx queue from the tx tasklet, we need to do it after the firmware upload is done. This patch also adds a cleanup step during the upload, to help reclaim memory faster. Fixes unprocessed queued frames eating up memory long after the firmware upload has already completed Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:28 +02:00
Lorenzo Bianconi	70911d9638	mt76: mt7615: add radar pattern test knob to debugfs Introduce mt7615_mcu_rdd_send_pattern routine to trigger a radar pattern detection. Moreover move debugfs related routines in a dedicated source file. Suggested-by: Ryder Lee <ryder.lee@mediatek.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:28 +02:00
Lorenzo Bianconi	5ec87dc8c3	mt76: mt7615: add csa support Add Channel Switch Announcement support to mt7615 driver updating beacon template with CSA IE received from mac80211 Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:28 +02:00
Lorenzo Bianconi	02fc62e374	mt76: mt7615: do not perform txcalibration before cac is complited Delay channel calibration after Channel Availability Check. Add some code cleanup to mt7615_mcu_set_channel Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:28 +02:00
Lorenzo Bianconi	d67a66469f	mt76: mt7615: add hw dfs pattern detector support Add hw radar detection support to mt7615 driver in order to unlock dfs channels on 5GHz band Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:28 +02:00
Lorenzo Bianconi	3ea8370537	mt76: mt7615: introduce mt7615_regd_notifier Introduce mt7615_regd_notifier callback. This is a preliminary patch to add radar detection support to mt7615 driver Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:28 +02:00
Lorenzo Bianconi	132d8da5bd	mt76: mt7615: fix sparse warnings: warning: restricted __le16 degrades to integer Fix the following sparse warning in __mt7615_mcu_msg_send: drivers/net/wireless/mediatek/mt76/mt7615/mcu.c:78:15: sparse: warning: restricted __le16 degrades to integer drivers/net/wireless/mediatek/mt76/mt7615/mcu.c:78:15: sparse: warning: cast from restricted __le16 Fixes: `04b8e65922` ("mt76: add mac80211 driver for MT7615 PCIe-based chipsets") Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:28 +02:00
Felix Fietkau	850e8f6fbd	mt76: round up length on mt76_wr_copy When beacon length is not a multiple of 4, the beacon could be sent with the last 1-3 bytes corrupted. The skb data is guaranteed to have enough room for reading beyond the end, because it is always followed by skb_shared_info, so rounding up is safe. All other callers of mt76_wr_copy have multiple-of-4 length already. Cc: stable@vger.kernel.org Signed-off-by: Felix Fietkau <nbd@nbd.name>	2019-09-05 17:42:27 +02:00
Anirudh Venkataramanan	5c875c1af8	ice: Rework around device/function capabilities ice_parse_caps is printing capabilities in a different way when compared to the variable names. This makes it difficult to search for the right strings in the debug logs. So this patch updates the print strings to be exactly the same as the fields' name in the structure. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:41 -07:00
Jesse Brandeburg	dd47e1fd86	ice: change default number of receive descriptors The driver should start out with a reasonable number of descriptors that can prevent drops due to a CPU being in a power management state. Change the default number of descriptors to 2048. The user can always change the value at runtime. Transmit descriptor counts are not modified because they don't need to change due to the speed of the interface, or for power managed CPUs, but the code is simplified to a fixed value for the transmit default. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:41 -07:00
Anirudh Venkataramanan	8c243700ab	ice: Minor refactor in queue management Remove q_left_tx and q_left_rx from the PF struct as these can be obtained by calling ice_get_avail_txq_count and ice_get_avail_rxq_count respectively. The function ice_determine_q_usage is only setting num_lan_tx and num_lan_rx in the PF structure, and these are later assigned to vsi->alloc_txq and vsi->alloc_rxq respectively. This is an unnecessary indirection, so remove ice_determine_q_usage and just assign values for vsi->alloc_txq and vsi->alloc_rxq in ice_vsi_set_num_qs and use these to set num_lan_tx and num_lan_rx respectively. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:41 -07:00
Dave Ertman	ea300f41bb	ice: Allow for delayed LLDP MIB change registration Add an additional boolean parameter to the ice_init_dcb function. This boolean controls if the LLDP MIB change events are registered for. Also, add a new function defined ice_cfg_lldp_mib_change. The additional function is necessary to be able to register for LLDP MIB change events after calling ice_init_dcb. The net effect of these two changes is to allow a delayed registration for MIB change events so that the driver is not accepting events before it is ready for them. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:41 -07:00
Ashish Shah	201beeb715	ice: update Tx context struct Add internal usage flag, bit 91 as described in spec. Update width of internal queue state to 122 also as described in spec. Signed-off-by: Ashish Shah <ashish.n.shah@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:41 -07:00
Akeem G Abodunrin	dfc6240012	ice: Report VF link status with opcode to get resources This patch changes how and when the driver report link status, instead of waiting till the call to enable queues for VF, we should report link status earlier with opcode to get VF resources - So as to avoid reporting erroneous information, especially when queues have not been configured. In addition, we can also make a call to get and report link status change after when queue is enabled, at least to report netdev or PHY link status. This is in accordance to how link speed is being reported for PF... Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:41 -07:00
Anirudh Venkataramanan	80739b57b1	ice: Check for DCB capability before initializing DCB Check the ICE_FLAG_DCB_CAPABLE before calling ice_init_pf_dcb. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:41 -07:00
Lukasz Czapnik	c61d234234	ice: report link down for VF when PF's queues are not enabled This is port of a fix from i40e commit `2ad1274fa3` ("i40e: don't report link up for a VF who hasn't enabled queues") Older VF drivers do not respond well to receiving a link up notification before queues are enabled. This can cause their state machine to think that it is safe to send traffic. This results in a Tx hang on the VF. Record whether the PF has actually enabled queues for the VF. When reporting link status, always report link down if the queues aren't enabled. In this way, the VF driver will never receive a link up notification until after its queues are enabled. Signed-off-by: Lukasz Czapnik <lukasz.czapnik@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:41 -07:00
Mitch Williams	29d42f1f3a	ice: Reliably reset VFs When a PFR (or bigger reset) occurs, the device clears the VF_MBX_ARQLEN register for all VFs. But if a VFR is triggered by a VF, the device does NOT clear this register, and the VF driver will never see the reset. When this happens, the VF driver will eventually timeout and attempt recovery, and usually it will be successful. But this makes resets take a long time and there are occasional failures. We cannot just blithely clear this register on every reset; this has been shown to cause synchronization problems when a PFR is triggered with a large number of VFs. Fix this by clearing VF_MBX_ARQLEN when the reset source is not PFR. GlobR will trigger PFR, so this test catches that occurrence as well. Signed-off-by: Mitch Williams <mitch.a.williams@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:40 -07:00
Jesse Brandeburg	9d56b7fd6a	ice: change work limit to a constant The driver has supported a transmit work limit that was configurable from ethtool for a long time, but there are no good use cases for having it be a variable that can be changed at run time. In addition, this variable was noted to be causing performance overhead due to cache misses. Just remove the variable and let the code use a constant so that the functionality is maintained (a limit on the number of transmits that will be cleaned in any one call to the clean routines) without the cache miss. Removes code, removes a variable, removes testing surface. Yay. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:40 -07:00
Jesse Brandeburg	d27525ec1f	ice: small efficiency fixes Add a small bit of efficiency to the code by adding a prefetch of the port_info structure in order to help avoid a cache miss a little later on in execution. Also add an unlikely statement to a branch which generally will never happen in normal operation. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:40 -07:00
Jesse Brandeburg	6503b65930	ice: move code closer together This is a simple patch to move the assignment to a local variable closer to the site where the local variable is used. This can help readability and also maybe performance, although the performance enhancement is really dependent upon the compiler. No functional change. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:40 -07:00
Jesse Brandeburg	2fb0821fd5	ice: clean up arguments There are a couple of functions that don't need two arguments passed in when the second argument already had access to the pointer pointed to by the first. Remove the unnecessary arguments. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:40 -07:00
Anirudh Venkataramanan	ade78c2ec1	ice: Check root pointer for validity ice_sched_get_tc_node uses pi->root without checking for NULL. Add a check to prevent NULL pointer dereference. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:40 -07:00
Anirudh Venkataramanan	208ff75135	ice: Add ice_get_main_vsi to get PF/main VSI There are multiple places where we currently use ice_find_vsi_by_type to get the PF (a.k.a. main) VSI. The PF VSI by definition is always the first element in the pf->vsi array (i.e. pf->vsi[0]). So instead add and use a new helper function ice_get_main_vsi, which just returns pf->vsi[0]. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:40 -07:00
Brett Creeley	34cdcb165b	ice: Update fields in ice_vsi_set_num_qs when reconfiguring Currently when vsi->req_txqs or vsi->req_rxqs are set we don't correctly set the number of vsi->num_q_vectors. Fix this by setting the number of queue vectors based on the max between the vsi->alloc_txqs and vsi->alloc_rxqs. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2019-09-05 08:13:40 -07:00
Daniel Borkmann	593f191a80	Merge branch 'bpf-af-xdp-barrier-fixes' Björn Töpel says: ==================== This is a four patch series of various barrier, {READ, WRITE}_ONCE cleanups in the AF_XDP socket code. More details can be found in the corresponding commit message. Previous revisions: v1 [4] and v2 [5]. For an AF_XDP socket, most control plane operations are done under the control mutex (struct xdp_sock, mutex), but there are some places where members of the struct is read outside the control mutex. The dev, queue_id members are set in bind() and cleared at cleanup. The umem, fq, cq, tx, rx, and state member are all assigned in various places, e.g. bind() and setsockopt(). When the members are assigned, they are protected by the control mutex, but since they are read outside the mutex, a WRITE_ONCE is required to avoid store-tearing on the read-side. Prior the state variable was introduced by Ilya, the dev member was used to determine whether the socket was bound or not. However, when dev was read, proper SMP barriers and READ_ONCE were missing. In order to address the missing barriers and READ_ONCE, we start using the state variable as a point of synchronization. The state member read/write is paired with proper SMP barriers, and from this follows that the members described above does not need READ_ONCE statements if used in conjunction with state check. To summarize: The members struct xdp_sock members dev, queue_id, umem, fq, cq, tx, rx, and state were read lock-less, with incorrect barriers and missing {READ, WRITE}_ONCE. After this series umem, fq, cq, tx, rx, and state are read lock-less. When these members are updated, WRITE_ONCE is used. When read, READ_ONCE are only used when read outside the control mutex (e.g. mmap) or, not synchronized with the state member (XSK_BOUND plus smp_rmb()) [1] https://lore.kernel.org/bpf/beef16bb-a09b-40f1-7dd0-c323b4b89b17@iogearbox.net/ [2] https://lwn.net/Articles/793253/ [3] https://github.com/google/ktsan/wiki/READ_ONCE-and-WRITE_ONCE [4] https://lore.kernel.org/bpf/20190822091306.20581-1-bjorn.topel@gmail.com/ [5] https://lore.kernel.org/bpf/20190826061053.15996-1-bjorn.topel@gmail.com/ v2->v3: Minor restructure of commits. Improve cover and commit messages. (Daniel) v1->v2: Removed redundant dev check. (Jonathan) ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-09-05 14:11:53 +02:00
Björn Töpel	25dc18ff9b	xsk: lock the control mutex in sock_diag interface When accessing the members of an XDP socket, the control mutex should be held. This commit fixes that. Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Fixes: `a36b38aa2a` ("xsk: add sock_diag interface for AF_XDP") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-09-05 14:11:52 +02:00
Björn Töpel	42fddcc7c6	xsk: use state member for socket synchronization Prior the state variable was introduced by Ilya, the dev member was used to determine whether the socket was bound or not. However, when dev was read, proper SMP barriers and READ_ONCE were missing. In order to address the missing barriers and READ_ONCE, we start using the state variable as a point of synchronization. The state member read/write is paired with proper SMP barriers, and from this follows that the members described above does not need READ_ONCE if used in conjunction with state check. In all syscalls and the xsk_rcv path we check if state is XSK_BOUND. If that is the case we do a SMP read barrier, and this implies that the dev, umem and all rings are correctly setup. Note that no READ_ONCE are needed for these variable if used when state is XSK_BOUND (plus the read barrier). To summarize: The members struct xdp_sock members dev, queue_id, umem, fq, cq, tx, rx, and state were read lock-less, with incorrect barriers and missing {READ, WRITE}_ONCE. Now, umem, fq, cq, tx, rx, and state are read lock-less. When these members are updated, WRITE_ONCE is used. When read, READ_ONCE are only used when read outside the control mutex (e.g. mmap) or, not synchronized with the state member (XSK_BOUND plus smp_rmb()) Note that dev and queue_id do not need a WRITE_ONCE or READ_ONCE, due to the introduce state synchronization (XSK_BOUND plus smp_rmb()). Introducing the state check also fixes a race, found by syzcaller, in xsk_poll() where umem could be accessed when stale. Suggested-by: Hillf Danton <hdanton@sina.com> Reported-by: syzbot+c82697e3043781e08802@syzkaller.appspotmail.com Fixes: `77cd0d7b3f` ("xsk: add support for need_wakeup flag in AF_XDP rings") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-09-05 14:11:52 +02:00
Björn Töpel	9764f4b301	xsk: avoid store-tearing when assigning umem The umem member of struct xdp_sock is read outside of the control mutex, in the mmap implementation, and needs a WRITE_ONCE to avoid potential store-tearing. Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Fixes: `423f38329d` ("xsk: add umem fill queue support and mmap") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-09-05 14:11:52 +02:00
Björn Töpel	94a997637c	xsk: avoid store-tearing when assigning queues Use WRITE_ONCE when doing the store of tx, rx, fq, and cq, to avoid potential store-tearing. These members are read outside of the control mutex in the mmap implementation. Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com> Fixes: `37b076933a` ("xsk: add missing write- and data-dependency barrier") Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2019-09-05 14:11:52 +02:00
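
The xsk commits above (`42fddcc7c6`, `9764f4b301`, `94a997637c`) describe publishing socket fields through a state member: the writer fills in the fields and then sets the state to XSK_BOUND, and a reader that observes XSK_BOUND issues a read barrier before touching those fields. A minimal user-space sketch of that pattern follows. It is not the kernel code: it assumes C11 `<stdatomic.h>` release/acquire operations as a stand-in for the kernel's WRITE_ONCE/READ_ONCE plus SMP barriers, and `struct demo_sock`, `DEMO_READY`, `DEMO_BOUND`, `demo_bind` and `demo_queue_id` are hypothetical names made up for illustration.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical states, loosely modelled on the XSK_BOUND check in the log. */
enum demo_state { DEMO_READY = 0, DEMO_BOUND = 1 };

/* Hypothetical stand-in for a socket; not the kernel's struct xdp_sock. */
struct demo_sock {
	const char *dev;   /* plain field, published through 'state' */
	int queue_id;      /* plain field, published through 'state' */
	_Atomic int state; /* synchronization point */
};

/* Writer side: set the fields first, then publish them with a release
 * store. This plays the role of "write barrier, then WRITE_ONCE(state)". */
static void demo_bind(struct demo_sock *s, const char *dev, int queue_id)
{
	s->dev = dev;
	s->queue_id = queue_id;
	atomic_store_explicit(&s->state, DEMO_BOUND, memory_order_release);
}

/* Reader side: an acquire load of the state plays the role of
 * "READ_ONCE(state) == BOUND, then read barrier". Only after it observes
 * DEMO_BOUND are the plain fields read. Returns -1 when not bound. */
static int demo_queue_id(struct demo_sock *s)
{
	if (atomic_load_explicit(&s->state, memory_order_acquire) != DEMO_BOUND)
		return -1;
	return s->queue_id;
}
```

Calling `demo_queue_id()` on a socket that has not been bound returns -1; after `demo_bind(&s, "eth0", 3)` it returns 3. The point of the design is that only the state member needs atomic access, while the fields it guards can stay plain as long as every reader checks the state first.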


859224 Commits