linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-26 06:35:04 +07:00

Author	SHA1	Message	Date
Ivan Khoronzhuk	4b4255ed06	net: ethernet: ti: cpsw: restore shaper configuration while down/up Need to restore shapers configuration after interface was down/up. This is needed as appropriate configuration is still replicated in kernel settings. This only shapers context restore, so vlan configuration should be restored by user if needed, especially for devices with one port where vlan frames are sent via ALE. Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 20:34:36 -07:00
Ivan Khoronzhuk	57d9014825	net: ethernet: ti: cpsw: add CBS Qdisc offload The cpsw has up to 4 FIFOs per port and upper 3 FIFOs can feed rate limited queue with shaping. In order to set and enable shaping for those 3 FIFOs queues the network device with CBS qdisc attached is needed. The CBS configuration is added for dual-emac/single port mode only, but potentially can be used in switch mode also, based on switchdev for instance. Despite the FIFO shapers can work w/o cpdma level shapers the base usage must be in combine with cpdma level shapers as described in TRM, that are set as maximum rates for interface queues with sysfs. One of the possible configuration with txq shapers and CBS shapers: Configured with echo RATE > /sys/class/net/eth0/queues/tx-0/tx_maxrate /--------------------------------------------------- / / cpdma level shapers +----+ +----+ +----+ +----+ +----+ +----+ +----+ +----+ \| c7 \| \| c6 \| \| c5 \| \| c4 \| \| c3 \| \| c2 \| \| c1 \| \| c0 \| \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \ / \/ \/ \/ \/ \/ \/ \/ \/ +---------\|------\|------\|------\|-------------------------------------+ \| +----+ \| \| +---+ \| \| \| +----+ \| \| \| \| v v v v \| \| +----+ +----+ +----+ +----+ p p+----+ +----+ +----+ +----+ \| \| \| \| \| \| \| \| \| \| o o\| \| \| \| \| \| \| \| \| \| \| f3 \| \| f2 \| \| f1 \| \| f0 \| r CPSW r\| f3 \| \| f2 \| \| f1 \| \| f0 \| \| \| \| \| \| \| \| \| \| \| t t\| \| \| \| \| \| \| \| \| \| \ / \ / \ / \ / 0 1\ / \ / \ / \ / \| \| \ X \ / \ / \ / \ / \ / \ / \ / \| \| \/ \ \/ \/ \/ \/ \/ \/ \/ \| +-------\------------------------------------------------------------+ \ \ FIFO shaper, set with CBS offload added in this patch, \ FIFO0 cannot be rate limited ------------------------------------------------------ CBS shaper configuration is supposed to be used with root MQPRIO Qdisc offload allowing to add sk_prio->tc->txq maps that direct traffic to appropriate tx queue and maps L2 priority to FIFO shaper. The CBS shaper is intended to be used for AVB where L2 priority (pcp field) is used to differentiate class of traffic. So additionally vlan needs to be created with appropriate egress sk_prio->l2 prio map. If CBS has several tx queues assigned to it, the sum of their bandwidth has not overlap bandwidth set for CBS. It's recomended the CBS bandwidth to be a little bit more. The CBS shaper is configured with CBS qdisc offload interface using tc tool from iproute2 packet. For instance: $ tc qdisc replace dev eth0 handle 100: parent root mqprio num_tc 3 \ map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 1 $ tc -g class show dev eth0 +---(100:ffe2) mqprio \| +---(100:3) mqprio \| +---(100:4) mqprio \| +---(100:ffe1) mqprio \| +---(100:2) mqprio \| +---(100:ffe0) mqprio +---(100:1) mqprio $ tc qdisc add dev eth0 parent 100:1 cbs locredit -1440 \ hicredit 60 sendslope -960000 idleslope 40000 offload 1 $ tc qdisc add dev eth0 parent 100:2 cbs locredit -1470 \ hicredit 62 sendslope -980000 idleslope 20000 offload 1 The above code set CBS shapers for tc0 and tc1, for that txq0 and txq1 is used. Pay attention, the real set bandwidth can differ a bit due to discreteness of configuration parameters. Here parameters like locredit, hicredit and sendslope are ignored internally and are supposed to be set with assumption that maximum frame size for frame - 1500. It's supposed that interface speed is not changed while reconnection, not always is true, so inform user in case speed of interface was changed, as it can impact on dependent shapers configuration. For more examples see Documentation. Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 20:34:36 -07:00
Ivan Khoronzhuk	7929a66871	net: ethernet: ti: cpsw: add MQPRIO Qdisc offload That's possible to offload vlan to tc priority mapping with assumption sk_prio == L2 prio. Example: $ ethtool -L eth0 rx 1 tx 4 $ qdisc replace dev eth0 handle 100: parent root mqprio num_tc 3 \ map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 queues 1@0 1@1 2@2 hw 1 $ tc -g class show dev eth0 +---(100:ffe2) mqprio \| +---(100:3) mqprio \| +---(100:4) mqprio \| +---(100:ffe1) mqprio \| +---(100:2) mqprio \| +---(100:ffe0) mqprio +---(100:1) mqprio Here, 100:1 is txq0, 100:2 is txq1, 100:3 is txq2, 100:4 is txq3 txq0 belongs to tc0, txq1 to tc1, txq2 and txq3 to tc2 The offload part only maps L2 prio to classes of traffic, but not to transmit queues, so to direct traffic to traffic class vlan has to be created with appropriate egress map. Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 20:34:36 -07:00
Ivan Khoronzhuk	4bb6c356a0	net: ethernet: ti: cpdma: fit rated channels in backward order According to TRM tx rated channels should be in 7..0 order, so correct it. Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 20:34:36 -07:00
Ivan Khoronzhuk	79b3325d0d	net: ethernet: ti: cpsw: use cpdma channels in backward order for txq The cpdma channel highest priority is from hi to lo number. The driver has limited number of descriptors that are shared between number of cpdma channels. Number of queues can be tuned with ethtool, that allows to not spend descriptors on not needed cpdma channels. In AVB usually only 2 tx queues can be enough with rate limitation. The rate limitation can be used only for hi priority queues. Thus, to use only 2 queues the 8 has to be created. It's wasteful. So, in order to allow using only needed number of rate limited tx queues, save resources, and be able to set rate limitation for them, let assign tx cpdma channels in backward order to queues. Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com> Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 20:34:36 -07:00
David S. Miller	b19c7bb1ac	mlx5e-updates-2018-07-18 This series includes update for mlx5e net device driver. 1) From Feras Daoud, Added the support for firmware log tracing, first by introducing the firmware API needed for the task and then For each PF do the following: 1- Allocate memory for the tracer strings database and read it from the FW to the SW. 2- Allocate and dma map tracer buffers. Traces that will be written into the buffer will be parsed as a group of one or more traces, referred to as trace message. The trace message represents a C-like printf string. Once a new trace is available FW will generate an event indicates new trace/s are available and the driver will parse them and dump them using tracepoints event tracing Enable mlx5 fw tracing by: echo 1 > /sys/kernel/debug/tracing/events/mlx5/mlx5_fw/enable Read traces by: cat /sys/kernel/debug/tracing/trace 2) From Roi Dayan, Remove redundant WARN when we cannot find neigh entry 3) From Jianbo Liu, TC double vlan support - Support offloading tc double vlan headers match - Support offloading double vlan push/pop tc actions 4) From Boris, re-visit UDP GSO, remove the splitting of UDP_GSO_L4 packets in the driver, and exposes UDP_GSO_L4 as a PARTIAL_GSO feature. -----BEGIN PGP SIGNATURE----- iQEbBAABAgAGBQJbVlEZAAoJEEg/ir3gV/o+x00H8gKfpMcKoDpT/EOq0NbCjnHI 87cxUqtk999TaoxD7YbNjQh6vyMvQOE6WwEZIIpvc6JzeSWtYN9FELyQC+deYH+/ 299WbfdiPxADfBB2DzbTlPhGOgaO26zA+yAYgdp7FW9M1r3USWExaUg1UzMTdxKR 4CsWUsG+yB3KlAKvuGjjRU1bN/+NivmK5mgT9PXd9m9fjobBENERU8dscCVmpMro o2z6ajKZ26a0jo0az99vDBUu6t1SC6QN1nJHY3iWBVY1Mvjy9XrcQ4LDR5wSjelU EiM9Hn2eVg5OddrlFEEi7yEeLHgtda3p/3qb1zx2YY9vuUM79R3MYz0uAPuaIw== =j+2g -----END PGP SIGNATURE----- Merge tag 'mlx5e-updates-2018-07-18-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5e-updates-2018-07-18 This series includes update for mlx5e net device driver. 1) From Feras Daoud, Added the support for firmware log tracing, first by introducing the firmware API needed for the task and then For each PF do the following: 1- Allocate memory for the tracer strings database and read it from the FW to the SW. 2- Allocate and dma map tracer buffers. Traces that will be written into the buffer will be parsed as a group of one or more traces, referred to as trace message. The trace message represents a C-like printf string. Once a new trace is available FW will generate an event indicates new trace/s are available and the driver will parse them and dump them using tracepoints event tracing Enable mlx5 fw tracing by: echo 1 > /sys/kernel/debug/tracing/events/mlx5/mlx5_fw/enable Read traces by: cat /sys/kernel/debug/tracing/trace 2) From Roi Dayan, Remove redundant WARN when we cannot find neigh entry 3) From Jianbo Liu, TC double vlan support - Support offloading tc double vlan headers match - Support offloading double vlan push/pop tc actions 4) From Boris, re-visit UDP GSO, remove the splitting of UDP_GSO_L4 packets in the driver, and exposes UDP_GSO_L4 as a PARTIAL_GSO feature. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 20:22:33 -07:00
Florian Westphal	90fd131afc	netfilter: nf_tables: move dumper state allocation into ->start Shaochun Chen points out we leak dumper filter state allocations stored in dump_control->data in case there is an error before netlink sets cb_running (after which ->done will be called at some point). In order to fix this, add .start functions and do the allocations there. ->done is going to clean up, and in case error occurs before ->start invocation no cleanups need to be done anymore. Reported-by: shaochun chen <cscnull@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2018-07-24 00:36:33 +02:00
Boris Pismenny	3f44899ef2	net/mlx5e: Use PARTIAL_GSO for UDP segmentation This patch removes the splitting of UDP_GSO_L4 packets in the driver, and exposes UDP_GSO_L4 as a PARTIAL_GSO feature. Thus, the network stack is not responsible for splitting the packet into two. Signed-off-by: Boris Pismenny <borisp@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-07-23 15:01:11 -07:00
Jianbo Liu	cc495188a8	net/mlx5e: Support offloading double vlan push/pop tc actions As we can configure two push/pop actions in one flow table entry, add support to offload those double vlan actions in a rule to HW. Signed-off-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-07-23 15:01:11 -07:00
Jianbo Liu	1482bd3d50	net/mlx5e: Refactor tc vlan push/pop actions offloading Extract actions offloading code to a new function, and also extend data structures for double vlan actions. Signed-off-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-07-23 15:01:11 -07:00
Jianbo Liu	699e96ddf4	net/mlx5e: Support offloading tc double vlan headers match We can match on both outer and inner vlan tags, add support for offloading that. Signed-off-by: Jianbo Liu <jianbol@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-07-23 15:01:11 -07:00
Roi Dayan	c7f7ba8df8	net/mlx5e: Remove redundant WARN when we cannot find neigh entry It is possible for neigh entry not to exist if it was cleaned already. When we bring down an interface the neigh gets deleted but it could be that our listener for neigh event to clear the encap valid bit didn't start yet and the neigh update last used work is started first. In this scenario the encap entry has valid bit set but the neigh entry doesn't exist. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Paul Blakey <paulb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-07-23 15:01:11 -07:00
Saeed Mahameed	3101d1fc6b	net/mlx5: FW tracer, Add debug prints Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-07-23 15:01:11 -07:00
Feras Daoud	244069532f	net/mlx5: FW tracer, Enable tracing Add the tracer file to the makefile and add the init function to the load one flow. Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-07-23 15:01:11 -07:00
Feras Daoud	70dd6fdb89	net/mlx5: FW tracer, parse traces and kernel tracing support For each message the driver should do the following: 1- Find the message string in the strings database 2- Count the param number of each message 3- Wait for the param events and accumulate them 4- Calculate the event timestamp using the local event timestamp and the first timestamp event following it. 5- Print message to trace log Enable the tracing by: echo 1 > /sys/kernel/debug/tracing/events/mlx5/mlx5_fw/enable Read traces by: cat /sys/kernel/debug/tracing/trace Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-07-23 15:01:11 -07:00
Feras Daoud	c71ad41ccb	net/mlx5: FW tracer, events handling The tracer has one event, event 0x26, with two subtypes: - Subtype 0: Ownership change - Subtype 1: Traces available An ownership change occurs in the following cases: 1- Owner releases his ownership, in this case, an event will be sent to inform others to reattempt acquire ownership. 2- Ownership was taken by a higher priority tool, in this case the owner should understand that it lost ownership, and go through tear down flow. The second subtype indicates that there are traces in the trace buffer, in this case, the driver polls the tracer buffer for new traces, parse them and prepares the messages for printing. The HW starts tracing from the first address in the tracer buffer. Driver receives an event notifying that new trace block exists. HW posts a timestamp event at the last 8B of every 256B block. Comparing the timestamp to the last handled timestamp would indicate that this is a new trace block. Once the new timestamp is detected, the entire block is considered valid. Block validation and parsing, should be done after copying the current block to a different location, in order to avoid block overwritten during processing. Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-07-23 15:01:11 -07:00
Saeed Mahameed	e9cad2cea7	net/mlx5: FW tracer, register log buffer memory key Create a memory key and protection domain for the tracer log buffer. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-07-23 15:01:11 -07:00
Feras Daoud	48967ffdeb	net/mlx5: FW tracer, create trace buffer and copy strings database For each PF do the following: 1- Allocate memory for the tracer strings database and read the strings from the FW to the SW. These strings will be used later for parsing traces. 2- Allocate and dma map tracer buffers. Traces that will be written into the buffer will be parsed as a group of one or more traces, referred to as trace message. The trace message represents a C-like printf string. First trace of a message holds the pointer to the correct string in strings database. The following traces holds the variables of the message. Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-07-23 15:01:11 -07:00
Feras Daoud	f53aaa31cc	net/mlx5: FW tracer, implement tracer logic Implement FW tracer logic and registers access, initialization and cleanup flows. Initializing the tracer will be part of load one flow, as multiple PFs will try to acquire ownership but only one will succeed and will be the tracer owner. Signed-off-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-07-23 15:01:11 -07:00
Saeed Mahameed	7854ac44fe	Merge branch 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux mlx5 core infrastructure updates and fixes. From Eran: - Add MPEGC (Management PCIe General Configuration) registers and btis - Fix tristate and description for MLX5 module rom Feras: - Add hardware structures for the firmware tracer From Jainbo: - Core support for double vlan push/pop steering action From Max: - Add XRQ commands definitions From Noa: - Add missing SET_DRIVER_VERSION command translation From Roi: - Use ERR_CAST() instead of coding it From Tariq: - Better return types for CQE API Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>	2018-07-23 14:58:46 -07:00
David S. Miller	c9eaaa1773	Merge branch 'lan743x-Add-features-to-lan743x-driver' Bryan Whitehead says: ==================== lan743x: Add features to lan743x driver This patch series adds extra features to the lan743x driver. Updates for v4: Patch 6/8 - Modified get/set_wol to use super set of MAC and PHY driver support. Patch 7/9 - In set_eee, return the return value from phy_ethtool_set_eee. Updates for v3: Removed patch 9 from this series, regarding PTP support Patch 6/8 - Add call to phy_ethtool_get_wol to lan743x_ethtool_get_wol Patch 7/8 - Add call to phy_ethtool_set_eee on (!eee->eee_enabled) Updates for v2: Patch 3/9 - Used ARRAY_SIZE macro in lan743x_ethtool_get_ethtool_stats. Patch 5/9 - Used MAX_EEPROM_SIZE in lan743x_ethtool_set_eeprom. Patch 6/9 - Removed unnecessary read of PMT_CTL. Used CRC algorithm from lib. Removed PHY interrupt settings from lan743x_pm_suspend Change "#if CONFIG_PM" to "#ifdef CONFIG_PM" ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 14:09:19 -07:00
Bryan Whitehead	43e8fe9b84	lan743x: Add RSS support Implement RSS support Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 14:09:19 -07:00
Bryan Whitehead	c9cf96bb5f	lan743x: Add EEE support Implement EEE support Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 14:09:19 -07:00
Bryan Whitehead	4d94282afd	lan743x: Add power management support Implement power management Supports suspend, resume, and Wake on LAN Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 14:09:18 -07:00
Bryan Whitehead	695846047a	lan743x: Add support for ethtool eeprom access Implement ethtool eeprom access Also provides access to OTP (One Time Programming) Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 14:09:18 -07:00
Bryan Whitehead	2958337d68	lan743x: Add support for ethtool message level Implement ethtool message level Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 14:09:18 -07:00
Bryan Whitehead	8114e8a2f1	lan743x: Add support for ethtool statistics Implement ethtool statistics Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 14:09:18 -07:00
Bryan Whitehead	63b92a91a4	lan743x: Add support for ethtool link settings Use default link setting functions Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 14:09:18 -07:00
Bryan Whitehead	0cf632265d	lan743x: Add support for ethtool get_drvinfo Implement ethtool get_drvinfo Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 14:09:18 -07:00
David S. Miller	8760c4d6d5	Merge branch 'sh_eth-clean-up-the-TSU-register-accessors' Sergei Shtylyov says: ==================== sh_eth: clean up the TSU register accessors Here's a set of 5 patches against DaveM's 'net-next.git' repo. They do a final clean up of the TSU register accessors... ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 12:34:51 -07:00
Sergei Shtylyov	51459d4c76	sh_eth: make sh_eth_tsu_{read\|write}_entry() prototypes symmetric sh_eth_tsu_read_entry() is still asymmetric with sh_eth_tsu_write_entry() WRT their prototypes -- make them symmetric by passing to the former a TSU register offset instead of its address and also adding the (now necessary) 'ndev' parameter... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 12:34:50 -07:00
Sergei Shtylyov	7a54c867ba	sh_eth: make sh_eth_tsu_write_entry() take 'offset' parameter We can add the TSU register base address to a TSU register offset right in sh_eth_tsu_write_entry(), no need to do it in its callers... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 12:34:50 -07:00
Sergei Shtylyov	ecbecb0a90	sh_eth: call sh_eth_tsu_get_offset() from TSU register accessors With sh_eth_tsu_get_offset() now actually returning TSU register's offset, we can at last use it in sh_eth_tsu_{read\|write}(). Somehow this saves 248 bytes of object code with AArch64 gcc 4.8.5... :-) Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 12:34:50 -07:00
Sergei Shtylyov	41414f0a85	sh_eth: make sh_eth_tsu_get_offset() match its name sh_eth_tsu_get_offset(), despite its name, returns a TSU register's address, not its offset. Make this function match its name and return a register's offset from the TSU registers base address instead. Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 12:34:50 -07:00
Sergei Shtylyov	388c4bb4dc	sh_eth: uninline sh_eth_tsu_get_offset() sh_eth_tsu_get_offset() is called several times by the driver, remove inline and move that function from the header to the driver itself to let gcc decide whether to expand it inline or not... Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 12:34:50 -07:00
David S. Miller	1a4f14bab1	Merge branch 'tcp-robust-ooo' Eric Dumazet says: ==================== Juha-Matti Tilli reported that malicious peers could inject tiny packets in out_of_order_queue, forcing very expensive calls to tcp_collapse_ofo_queue() and tcp_prune_ofo_queue() for every incoming packet. With tcp_rmem[2] default of 6MB, the ooo queue could contain ~7000 nodes. This patch series makes sure we cut cpu cycles enough to render the attack not critical. We might in the future go further, like disconnecting or black-holing proven malicious flows. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 12:01:48 -07:00
Eric Dumazet	58152ecbbc	tcp: add tcp_ooo_try_coalesce() helper In case skb in out_or_order_queue is the result of multiple skbs coalescing, we would like to get a proper gso_segs counter tracking, so that future tcp_drop() can report an accurate number. I chose to not implement this tracking for skbs in receive queue, since they are not dropped, unless socket is disconnected. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 12:01:36 -07:00
Eric Dumazet	8541b21e78	tcp: call tcp_drop() from tcp_data_queue_ofo() In order to be able to give better diagnostics and detect malicious traffic, we need to have better sk->sk_drops tracking. Fixes: `9f5afeae51` ("tcp: use an RB tree for ooo receive queue") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 12:01:36 -07:00
Eric Dumazet	3d4bf93ac1	tcp: detect malicious patterns in tcp_collapse_ofo_queue() In case an attacker feeds tiny packets completely out of order, tcp_collapse_ofo_queue() might scan the whole rb-tree, performing expensive copies, but not changing socket memory usage at all. 1) Do not attempt to collapse tiny skbs. 2) Add logic to exit early when too many tiny skbs are detected. We prefer not doing aggressive collapsing (which copies packets) for pathological flows, and revert to tcp_prune_ofo_queue() which will be less expensive. In the future, we might add the possibility of terminating flows that are proven to be malicious. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 12:01:36 -07:00
Eric Dumazet	f4a3313d8e	tcp: avoid collapses in tcp_prune_queue() if possible Right after a TCP flow is created, receiving tiny out of order packets allways hit the condition : if (atomic_read(&sk->sk_rmem_alloc) >= sk->sk_rcvbuf) tcp_clamp_window(sk); tcp_clamp_window() increases sk_rcvbuf to match sk_rmem_alloc (guarded by tcp_rmem[2]) Calling tcp_collapse_ofo_queue() in this case is not useful, and offers a O(N^2) surface attack to malicious peers. Better not attempt anything before full queue capacity is reached, forcing attacker to spend lots of resource and allow us to more easily detect the abuse. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 12:01:36 -07:00
Eric Dumazet	72cd43ba64	tcp: free batches of packets in tcp_prune_ofo_queue() Juha-Matti Tilli reported that malicious peers could inject tiny packets in out_of_order_queue, forcing very expensive calls to tcp_collapse_ofo_queue() and tcp_prune_ofo_queue() for every incoming packet. out_of_order_queue rb-tree can contain thousands of nodes, iterating over all of them is not nice. Before linux-4.9, we would have pruned all packets in ofo_queue in one go, every XXXX packets. XXXX depends on sk_rcvbuf and skbs truesize, but is about 7000 packets with tcp_rmem[2] default of 6 MB. Since we plan to increase tcp_rmem[2] in the future to cope with modern BDP, can not revert to the old behavior, without great pain. Strategy taken in this patch is to purge ~12.5 % of the queue capacity. Fixes: `36a6503fed` ("tcp: refine tcp_prune_ofo_queue() to not drop all packets") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Juha-Matti Tilli <juha-matti.tilli@iki.fi> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 12:01:36 -07:00
Paolo Abeni	3dd1c9a127	ip: hash fragments consistently The skb hash for locally generated ip[v6] fragments belonging to the same datagram can vary in several circumstances: * for connected UDP[v6] sockets, the first fragment get its hash via set_owner_w()/skb_set_hash_from_sk() * for unconnected IPv6 UDPv6 sockets, the first fragment can get its hash via ip6_make_flowlabel()/skb_get_hash_flowi6(), if auto_flowlabel is enabled For the following frags the hash is usually computed via skb_get_hash(). The above can cause OoO for unconnected IPv6 UDPv6 socket: in that scenario the egress tx queue can be selected on a per packet basis via the skb hash. It may also fool flow-oriented schedulers to place fragments belonging to the same datagram in different flows. Fix the issue by copying the skb hash from the head frag into the others at fragmentation time. Before this commit: perf probe -a "dev_queue_xmit skb skb->hash skb->l4_hash:b1@0/8 skb->sw_hash:b1@1/8" netperf -H $IPV4 -t UDP_STREAM -l 5 -- -m 2000 -n & perf record -e probe:dev_queue_xmit -e probe:skb_set_owner_w -a sleep 0.1 perf script probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=3713014309 l4_hash=1 sw_hash=0 probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=0 l4_hash=0 sw_hash=0 After this commit: probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=2171763177 l4_hash=1 sw_hash=0 probe:dev_queue_xmit: (ffffffff8c6b1b20) hash=2171763177 l4_hash=1 sw_hash=0 Fixes: `b73c3d0e4f` ("net: Save TX flow hash in sock and set in skbuf on xmit") Fixes: `67800f9b1f` ("ipv6: Call skb_get_hash_flowi6 to get skb->hash in ip6_make_flowlabel") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 11:39:30 -07:00
Wei Wang	e873e4b9cc	ipv6: use fib6_info_hold_safe() when necessary In the code path where only rcu read lock is held, e.g. in the route lookup code path, it is not safe to directly call fib6_info_hold() because the fib6_info may already have been deleted but still exists in the rcu grace period. Holding reference to it could cause double free and crash the kernel. This patch adds a new function fib6_info_hold_safe() and replace fib6_info_hold() in all necessary places. Syzbot reported 3 crash traces because of this. One of them is: 8021q: adding VLAN 0 to HW filter on device team0 IPv6: ADDRCONF(NETDEV_CHANGE): team0: link becomes ready dst_release: dst:(____ptrval____) refcnt:-1 dst_release: dst:(____ptrval____) refcnt:-2 WARNING: CPU: 1 PID: 4845 at include/net/dst.h:239 dst_hold include/net/dst.h:239 [inline] WARNING: CPU: 1 PID: 4845 at include/net/dst.h:239 ip6_setup_cork+0xd66/0x1830 net/ipv6/ip6_output.c:1204 dst_release: dst:(____ptrval____) refcnt:-1 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 4845 Comm: syz-executor493 Not tainted 4.18.0-rc3+ #10 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x2b4 lib/dump_stack.c:113 panic+0x238/0x4e7 kernel/panic.c:184 dst_release: dst:(____ptrval____) refcnt:-2 dst_release: dst:(____ptrval____) refcnt:-3 __warn.cold.8+0x163/0x1ba kernel/panic.c:536 dst_release: dst:(____ptrval____) refcnt:-4 report_bug+0x252/0x2d0 lib/bug.c:186 fixup_bug arch/x86/kernel/traps.c:178 [inline] do_error_trap+0x1fc/0x4d0 arch/x86/kernel/traps.c:296 dst_release: dst:(____ptrval____) refcnt:-5 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:316 invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:992 RIP: 0010:dst_hold include/net/dst.h:239 [inline] RIP: 0010:ip6_setup_cork+0xd66/0x1830 net/ipv6/ip6_output.c:1204 Code: c1 ed 03 89 9d 18 ff ff ff 48 b8 00 00 00 00 00 fc ff df 41 c6 44 05 00 f8 e9 2d 01 00 00 4c 8b a5 c8 fe ff ff e8 1a f6 e6 fa <0f> 0b e9 6a fc ff ff e8 0e f6 e6 fa 48 8b 85 d0 fe ff ff 48 8d 78 RSP: 0018:ffff8801a8fcf178 EFLAGS: 00010293 RAX: ffff8801a8eba5c0 RBX: 0000000000000000 RCX: ffffffff869511e6 RDX: 0000000000000000 RSI: ffffffff869515b6 RDI: 0000000000000005 RBP: ffff8801a8fcf2c8 R08: ffff8801a8eba5c0 R09: ffffed0035ac8338 R10: ffffed0035ac8338 R11: ffff8801ad6419c3 R12: ffff8801a8fcf720 R13: ffff8801a8fcf6a0 R14: ffff8801ad6419c0 R15: ffff8801ad641980 ip6_make_skb+0x2c8/0x600 net/ipv6/ip6_output.c:1768 udpv6_sendmsg+0x2c90/0x35f0 net/ipv6/udp.c:1376 inet_sendmsg+0x1a1/0x690 net/ipv4/af_inet.c:798 sock_sendmsg_nosec net/socket.c:641 [inline] sock_sendmsg+0xd5/0x120 net/socket.c:651 ___sys_sendmsg+0x51d/0x930 net/socket.c:2125 __sys_sendmmsg+0x240/0x6f0 net/socket.c:2220 __do_sys_sendmmsg net/socket.c:2249 [inline] __se_sys_sendmmsg net/socket.c:2246 [inline] __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2246 do_syscall_64+0x1b9/0x820 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x446ba9 Code: e8 cc bb 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 eb 08 fc ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fb39a469da8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133 RAX: ffffffffffffffda RBX: 00000000006dcc54 RCX: 0000000000446ba9 RDX: 00000000000000b8 RSI: 0000000020001b00 RDI: 0000000000000003 RBP: 00000000006dcc50 R08: 00007fb39a46a700 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 45c828efc7a64843 R13: e6eeb815b9d8a477 R14: 5068caf6f713c6fc R15: 0000000000000001 Dumping ftrace buffer: (ftrace buffer empty) Kernel Offset: disabled Rebooting in 86400 seconds.. Fixes: `93531c6743` ("net/ipv6: separate handling of FIB entries from dst based routes") Reported-by: syzbot+902e2a1bcd4f7808cef5@syzkaller.appspotmail.com Reported-by: syzbot+8ae62d67f647abeeceb9@syzkaller.appspotmail.com Reported-by: syzbot+3f08feb14086930677d0@syzkaller.appspotmail.com Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 11:19:02 -07:00
YueHaibing	fd800f6464	wan/fsl_ucc_hdlc: use IS_ERR_VALUE() to check return value of qe_muram_alloc qe_muram_alloc return a unsigned long integer,which should not compared with zero. check it using IS_ERR_VALUE() to fix this. Fixes: `c19b6d246a` ("drivers/net: support hdlc function for QE-UCC") Signed-off-by: YueHaibing <yuehaibing@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 11:07:10 -07:00
David S. Miller	5302a84e37	linux-can-fixes-for-4.18-20180723 -----BEGIN PGP SIGNATURE----- iQFHBAABCgAxFiEENrCndlB/VnAEWuH5k9IU1zQoZfEFAltVzJQTHG1rbEBwZW5n dXRyb25peC5kZQAKCRCT0hTXNChl8e/pCADT/Al1tOpzM0EYs3fdSDzqi3caxkAI gpxGDcM8fpiJn5psC0DvFJ4vf8GBQBdkoS8W6M3ieSxrwxvFlK0hQsujbBn4vAGt WaKgDYvXJY5PJBjwwHSFMHRVGmzzg7YOQ3CqFsoHlVjr+gPA6T4qgclnPQrUgWQY y6IuQfoAagsP3ezDV15hiRhPaI4SJrCK27XfAD8Cbr9rrl7sa4ifsP20Wf2xoFDu lbf/bVMjOiYANrga4Pz8PNlFiIX9F3kW0Qc81eyJkvBEnmxRThZ7nvP/DHd3vVoF k2EUBaGhdyu5URHASTZRKnT2WKTbeAJ48wFhwT9QCD169SrZWhv7r/nP =wU3+ -----END PGP SIGNATURE----- Merge tag 'linux-can-fixes-for-4.18-20180723' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2018-07-23 this is a pull request of 12 patches for net/master. The patch by Stephane Grosjean for the peak_canfd CAN driver fixes a problem with older firmware. The next patch is by Roman Fietze and fixes the setup of the CCCR register in the m_can driver. Nicholas Mc Guire's patch for the mpc5xxx_can driver adds missing error checking. The two patches by Faiz Abbas fix the runtime resume and clean up the probe function in the m_can driver. The last 7 patches by Anssi Hannula fix several problem in the xilinx_can driver. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 11:02:21 -07:00
David S. Miller	6a525818b1	Merge branch 'smc-next' Ursula Braun says: ==================== net/smc: patches 2018-07-23 here are some small patches for SMC: Just the first patch contains a functional change. It allows to differ between the modes SMCR and SMCD on s390 when monitoring SMC sockets. The remaining patches are cleanups without functional changes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 10:57:14 -07:00
Ursula Braun	48bf523177	net/smc: remove local variable page in smc_rx_splice() The page map address is already stored in the RMB descriptor. There is no need to derive it from the cpu_addr value. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 10:57:14 -07:00
Ursula Braun	144ce4b9b5	net/smc: use DECLARE_BITMAP for rtokens_used_mask Link group field tokens_used_mask is a bitmap. Use macro DECLARE_BITMAP for its definition. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 10:57:14 -07:00
Stefan Raspl	00e5fb263f	net/smc: add function to get link group from link Replace a frequently used construct with a more readable variant, reducing the code. Also might come handy when we start to support more than a single per link group. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 10:57:14 -07:00
Stefan Raspl	bac6de7b63	net/smc: eliminate cursor read and write calls The functions to read and write cursors are exclusively used to copy cursors. Therefore switch to a respective function instead. Signed-off-by: Stefan Raspl <raspl@linux.ibm.com> Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-07-23 10:57:14 -07:00

... 4 5 6 7 8 ...

769455 Commits