linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-28 11:18:45 +07:00

Author	SHA1	Message	Date
Thomas Gleixner	25763b3c86	treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 206 Based on 1 normalized pattern(s): this program is free software you can redistribute it and or modify it under the terms of version 2 of the gnu general public license as published by the free software foundation extracted by the scancode license scanner the SPDX license identifier GPL-2.0-only has been chosen to replace the boilerplate/reference in 107 file(s). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Allison Randal <allison@lohutok.net> Reviewed-by: Richard Fontana <rfontana@redhat.com> Reviewed-by: Steve Winslow <swinslow@gmail.com> Reviewed-by: Alexios Zavras <alexios.zavras@intel.com> Cc: linux-spdx@vger.kernel.org Link: https://lkml.kernel.org/r/20190528171438.615055994@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-05-30 11:29:53 -07:00
Dean Nelson	cd35ef9149	thunderx: eliminate extra calls to put_page() for pages held for recycling For the non-XDP case, commit `773225388d` ("net: thunderx: Optimize page recycling for XDP") added code to nicvf_free_rbdr() that, when releasing the additional receive buffer page reference held for recycling, repeatedly calls put_page() until the page's _refcount goes to zero. Which results in the page being freed. This is not okay if the page's _refcount was greater than 1 (in the non-XDP case), because nicvf_free_rbdr() should not be subtracting more than what nicvf_alloc_page() had previously added to the page's _refcount, which was only 1 (in the non-XDP case). This can arise if a received packet is still being processed and the receive buffer (i.e., skb->head) has not yet been freed via skb_free_head() when nicvf_free_rbdr() is spinning through the aforementioned put_page() loop. If this should occur, when the received packet finishes processing and skb_free_head() is called, various problems can ensue. Exactly what, depends on whether the page has already been reallocated or not, anything from "BUG: Bad page state ... ", to "Unable to handle kernel NULL pointer dereference ..." or "Unable to handle kernel paging request...". So this patch changes nicvf_free_rbdr() to only call put_page() once for pages held for recycling (in the non-XDP case). Fixes: `773225388d` ("net: thunderx: Optimize page recycling for XDP") Signed-off-by: Dean Nelson <dnelson@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-03-27 22:52:28 -07:00
Dean Nelson	b3e2080694	thunderx: enable page recycling for non-XDP case Commit `773225388d` ("net: thunderx: Optimize page recycling for XDP") added code to nicvf_alloc_page() that inadvertently disables receive buffer page recycling for the non-XDP case by always NULL'ng the page pointer. This patch corrects two if-conditionals to allow for the recycling of non-XDP mode pages by only setting the page pointer to NULL when the page is not ready for recycling. Fixes: `773225388d` ("net: thunderx: Optimize page recycling for XDP") Signed-off-by: Dean Nelson <dnelson@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2019-03-27 22:52:28 -07:00
Luis Chamberlain	750afb08ca	cross-tree: phase out dma_zalloc_coherent() We already need to zero out memory for dma_alloc_coherent(), as such using dma_zalloc_coherent() is superflous. Phase it out. This change was generated with the following Coccinelle SmPL patch: @ replace_dma_zalloc_coherent @ expression dev, size, data, handle, flags; @@ -dma_zalloc_coherent(dev, size, handle, flags) +dma_alloc_coherent(dev, size, handle, flags) Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org> [hch: re-ran the script on the latest tree] Signed-off-by: Christoph Hellwig <hch@lst.de>	2019-01-08 07:58:37 -05:00
Lorenzo Bianconi	ef2a7cf1d8	net: thunderx: set tso_hdrs pointer to NULL in nicvf_free_snd_queue Reset snd_queue tso_hdrs pointer to NULL in nicvf_free_snd_queue routine since it is used to check if tso dma descriptor queue has been previously allocated. The issue can be triggered with the following reproducer: $ip link set dev enP2p1s0v0 xdpdrv obj xdp_dummy.o $ip link set dev enP2p1s0v0 xdpdrv off [ 341.467649] WARNING: CPU: 74 PID: 2158 at mm/vmalloc.c:1511 __vunmap+0x98/0xe0 [ 341.515010] Hardware name: GIGABYTE H270-T70/MT70-HD0, BIOS T49 02/02/2018 [ 341.521874] pstate: 60400005 (nZCv daif +PAN -UAO) [ 341.526654] pc : __vunmap+0x98/0xe0 [ 341.530132] lr : __vunmap+0x98/0xe0 [ 341.533609] sp : ffff00001c5db860 [ 341.536913] x29: ffff00001c5db860 x28: 0000000000020000 [ 341.542214] x27: ffff810feb5090b0 x26: ffff000017e57000 [ 341.547515] x25: 0000000000000000 x24: 00000000fbd00000 [ 341.552816] x23: 0000000000000000 x22: ffff810feb5090b0 [ 341.558117] x21: 0000000000000000 x20: 0000000000000000 [ 341.563418] x19: ffff000017e57000 x18: 0000000000000000 [ 341.568719] x17: 0000000000000000 x16: 0000000000000000 [ 341.574020] x15: 0000000000000010 x14: ffffffffffffffff [ 341.579321] x13: ffff00008985eb27 x12: ffff00000985eb2f [ 341.584622] x11: ffff0000096b3000 x10: ffff00001c5db510 [ 341.589923] x9 : 00000000ffffffd0 x8 : ffff0000086868e8 [ 341.595224] x7 : 3430303030303030 x6 : 00000000000006ef [ 341.600525] x5 : 00000000003fffff x4 : 0000000000000000 [ 341.605825] x3 : 0000000000000000 x2 : ffffffffffffffff [ 341.611126] x1 : ffff0000096b3728 x0 : 0000000000000038 [ 341.616428] Call trace: [ 341.618866] __vunmap+0x98/0xe0 [ 341.621997] vunmap+0x3c/0x50 [ 341.624961] arch_dma_free+0x68/0xa0 [ 341.628534] dma_direct_free+0x50/0x80 [ 341.632285] nicvf_free_resources+0x160/0x2d8 [nicvf] [ 341.637327] nicvf_config_data_transfer+0x174/0x5e8 [nicvf] [ 341.642890] nicvf_stop+0x298/0x340 [nicvf] [ 341.647066] __dev_close_many+0x9c/0x108 [ 341.650977] dev_close_many+0xa4/0x158 [ 341.654720] rollback_registered_many+0x140/0x530 [ 341.659414] rollback_registered+0x54/0x80 [ 341.663499] unregister_netdevice_queue+0x9c/0xe8 [ 341.668192] unregister_netdev+0x28/0x38 [ 341.672106] nicvf_remove+0xa4/0xa8 [nicvf] [ 341.676280] nicvf_shutdown+0x20/0x30 [nicvf] [ 341.680630] pci_device_shutdown+0x44/0x88 [ 341.684720] device_shutdown+0x144/0x250 [ 341.688640] kernel_restart_prepare+0x44/0x50 [ 341.692986] kernel_restart+0x20/0x68 [ 341.696638] __se_sys_reboot+0x210/0x238 [ 341.700550] __arm64_sys_reboot+0x24/0x30 [ 341.704555] el0_svc_handler+0x94/0x110 [ 341.708382] el0_svc+0x8/0xc [ 341.711252] ---[ end trace 3f4019c8439959c9 ]--- [ 341.715874] page:ffff7e0003ef4000 count:0 mapcount:0 mapping:0000000000000000 index:0x4 [ 341.723872] flags: 0x1fffe000000000() [ 341.727527] raw: 001fffe000000000 ffff7e0003f1a008 ffff7e0003ef4048 0000000000000000 [ 341.735263] raw: 0000000000000004 0000000000000000 00000000ffffffff 0000000000000000 [ 341.742994] page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0) where xdp_dummy.c is a simple bpf program that forwards the incoming frames to the network stack (available here: https://github.com/altoor/xdp_walkthrough_examples/blob/master/sample_1/xdp_dummy.c) Fixes: `05c773f52b` ("net: thunderx: Add basic XDP support") Fixes: `4863dea3fa` ("net: Adding support for Cavium ThunderX network controller") Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-23 22:31:56 -08:00
Kees Cook	6396bb2215	treewide: kzalloc() -> kcalloc() The kzalloc() function has a 2-factor argument form, kcalloc(). This patch replaces cases of: kzalloc(a * b, gfp) with: kcalloc(a * b, gfp) as well as handling cases of: kzalloc(a * b * c, gfp) with: kzalloc(array3_size(a, b, c), gfp) as it's slightly less ugly than: kzalloc_array(array_size(a, b), c, gfp) This does, however, attempt to ignore constant size factors like: kzalloc(4 * 1024, gfp) though any constants defined via macros get caught up in the conversion. Any factors with a sizeof() of "unsigned char", "char", and "u8" were dropped, since they're redundant. The Coccinelle script used for this was: // Fix redundant parens around sizeof(). @@ type TYPE; expression THING, E; @@ ( kzalloc( - (sizeof(TYPE)) * E + sizeof(TYPE) * E , ...) \| kzalloc( - (sizeof(THING)) * E + sizeof(THING) * E , ...) ) // Drop single-byte sizes and redundant parens. @@ expression COUNT; typedef u8; typedef __u8; @@ ( kzalloc( - sizeof(u8) * (COUNT) + COUNT , ...) \| kzalloc( - sizeof(__u8) * (COUNT) + COUNT , ...) \| kzalloc( - sizeof(char) * (COUNT) + COUNT , ...) \| kzalloc( - sizeof(unsigned char) * (COUNT) + COUNT , ...) \| kzalloc( - sizeof(u8) * COUNT + COUNT , ...) \| kzalloc( - sizeof(__u8) * COUNT + COUNT , ...) \| kzalloc( - sizeof(char) * COUNT + COUNT , ...) \| kzalloc( - sizeof(unsigned char) * COUNT + COUNT , ...) ) // 2-factor product with sizeof(type/expression) and identifier or constant. @@ type TYPE; expression THING; identifier COUNT_ID; constant COUNT_CONST; @@ ( - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_ID) + COUNT_ID, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_ID + COUNT_ID, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * (COUNT_CONST) + COUNT_CONST, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * COUNT_CONST + COUNT_CONST, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_ID) + COUNT_ID, sizeof(THING) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * COUNT_ID + COUNT_ID, sizeof(THING) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * (COUNT_CONST) + COUNT_CONST, sizeof(THING) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * COUNT_CONST + COUNT_CONST, sizeof(THING) , ...) ) // 2-factor product, only identifiers. @@ identifier SIZE, COUNT; @@ - kzalloc + kcalloc ( - SIZE * COUNT + COUNT, SIZE , ...) // 3-factor product with 1 sizeof(type) or sizeof(expression), with // redundant parens removed. @@ expression THING; identifier STRIDE, COUNT; type TYPE; @@ ( kzalloc( - sizeof(TYPE) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kzalloc( - sizeof(TYPE) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kzalloc( - sizeof(TYPE) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kzalloc( - sizeof(TYPE) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(TYPE)) , ...) \| kzalloc( - sizeof(THING) * (COUNT) * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| kzalloc( - sizeof(THING) * (COUNT) * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| kzalloc( - sizeof(THING) * COUNT * (STRIDE) + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) \| kzalloc( - sizeof(THING) * COUNT * STRIDE + array3_size(COUNT, STRIDE, sizeof(THING)) , ...) ) // 3-factor product with 2 sizeof(variable), with redundant parens removed. @@ expression THING1, THING2; identifier COUNT; type TYPE1, TYPE2; @@ ( kzalloc( - sizeof(TYPE1) * sizeof(TYPE2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) \| kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2)) , ...) \| kzalloc( - sizeof(THING1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) \| kzalloc( - sizeof(THING1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(THING1), sizeof(THING2)) , ...) \| kzalloc( - sizeof(TYPE1) * sizeof(THING2) * COUNT + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) \| kzalloc( - sizeof(TYPE1) * sizeof(THING2) * (COUNT) + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2)) , ...) ) // 3-factor product, only identifiers, with redundant parens removed. @@ identifier STRIDE, SIZE, COUNT; @@ ( kzalloc( - (COUNT) * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - COUNT * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - COUNT * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - (COUNT) * (STRIDE) * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - COUNT * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - (COUNT) * STRIDE * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - (COUNT) * (STRIDE) * (SIZE) + array3_size(COUNT, STRIDE, SIZE) , ...) \| kzalloc( - COUNT * STRIDE * SIZE + array3_size(COUNT, STRIDE, SIZE) , ...) ) // Any remaining multi-factor products, first at least 3-factor products, // when they're not all constants... @@ expression E1, E2, E3; constant C1, C2, C3; @@ ( kzalloc(C1 * C2 * C3, ...) \| kzalloc( - (E1) * E2 * E3 + array3_size(E1, E2, E3) , ...) \| kzalloc( - (E1) * (E2) * E3 + array3_size(E1, E2, E3) , ...) \| kzalloc( - (E1) * (E2) * (E3) + array3_size(E1, E2, E3) , ...) \| kzalloc( - E1 * E2 * E3 + array3_size(E1, E2, E3) , ...) ) // And then all remaining 2 factors products when they're not all constants, // keeping sizeof() as the second factor argument. @@ expression THING, E1, E2; type TYPE; constant C1, C2, C3; @@ ( kzalloc(sizeof(THING) * C2, ...) \| kzalloc(sizeof(TYPE) * C2, ...) \| kzalloc(C1 * C2 * C3, ...) \| kzalloc(C1 * C2, ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * (E2) + E2, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(TYPE) * E2 + E2, sizeof(TYPE) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * (E2) + E2, sizeof(THING) , ...) \| - kzalloc + kcalloc ( - sizeof(THING) * E2 + E2, sizeof(THING) , ...) \| - kzalloc + kcalloc ( - (E1) * E2 + E1, E2 , ...) \| - kzalloc + kcalloc ( - (E1) * (E2) + E1, E2 , ...) \| - kzalloc + kcalloc ( - E1 * E2 + E1, E2 , ...) ) Signed-off-by: Kees Cook <keescook@chromium.org>	2018-06-12 16:19:22 -07:00
Jesper Dangaard Brouer	e6dbe9397e	Revert "net: thunderx: Add support for xdp redirect" This reverts commit `aa136d0c82`. As I previously[1] pointed out this implementation of XDP_REDIRECT is wrong. XDP_REDIRECT is a facility that must work between different NIC drivers. Another NIC driver can call ndo_xdp_xmit/nicvf_xdp_xmit, but your driver patch assumes payload data (at top of page) will contain a queue index and a DMA addr, this is not true and worse will likely contain garbage. Given you have not fixed this in due time (just reached v4.16-rc1), the only option I see is a revert. [1] http://lkml.kernel.org/r/20171211130902.482513d3@redhat.com Cc: Sunil Goutham <sgoutham@cavium.com> Cc: Christina Jacob <cjacob@caviumnetworks.com> Cc: Aleksey Makarov <aleksey.makarov@cavium.com> Fixes: `aa136d0c82` ("net: thunderx: Add support for xdp redirect") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-14 14:23:39 -05:00
Sunil Goutham	4a87550964	net: thunderx: add timestamping support This adds timestamping support for both receive and transmit paths. On the receive side no filters are supported i.e either all pkts will get a timestamp appended infront of the packet or none. On the transmit side HW doesn't support timestamp insertion but only generates a separate CQE with transmitted packet's timestamp. Also HW supports only one packet at a time for timestamping on the transmit side. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@cavium.com> Acked-by: Philippe Ombredanne <pombredanne@nexb.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-16 14:31:14 -05:00
Jesper Dangaard Brouer	27e95e3648	thunderx: setup xdp_rxq_info This driver uses a bool scheme for "enable"/"disable" when setting up different resources. Thus, the hook points for xdp_rxq_info is done in the same function call nicvf_rcv_queue_config(). This is activated through enable/disable via nicvf_config_data_transfer(), which is tied into nicvf_stop()/nicvf_open(). Extending driver packet handler call-path nicvf_rcv_pkt_handler() with a pointer to the given struct rcv_queue, in-order to access the xdp_rxq_info data area (in nicvf_xdp_rx()). V2: Driver have no proper error path for failed XDP RX-queue info reg, as nicvf_rcv_queue_config is a void function. Cc: linux-arm-kernel@lists.infradead.org Cc: Sunil Goutham <sgoutham@cavium.com> Cc: Robert Richter <rric@kernel.org> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-01-05 15:21:21 -08:00
David S. Miller	51e18a453f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflict was two parallel additions of include files to sch_generic.c, no biggie. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-09 22:09:55 -05:00
Florian Westphal	134059fd27	net: thunderx: Fix TCP/UDP checksum offload for IPv4 pkts Offload IP header checksum to NIC. This fixes a previous patch which disabled checksum offloading for both IPv4 and IPv6 packets. So L3 checksum offload was getting disabled for IPv4 pkts. And HW is dropping these pkts for some reason. Without this patch, IPv4 TSO appears to be broken: WIthout this patch I get ~16kbyte/s, with patch close to 2mbyte/s when copying files via scp from test box to my home workstation. Looking at tcpdump on sender it looks like hardware drops IPv4 TSO skbs. This patch restores performance for me, ipv6 looks good too. Fixes: `fa6d7cb5d7` ("net: thunderx: Fix TCP/UDP checksum offload for IPv6 pkts") Cc: Sunil Goutham <sgoutham@cavium.com> Cc: Aleksey Makarov <aleksey.makarov@auriga.com> Cc: Eric Dumazet <edumazet@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-06 14:43:31 -05:00
Sunil Goutham	aa136d0c82	net: thunderx: Add support for xdp redirect This patch adds support for XDP_REDIRECT. Flush is not yet supported. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: cjacob <cjacob@caviumnetworks.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-30 09:24:07 -05:00
Sunil Goutham	fa6d7cb5d7	net: thunderx: Fix TCP/UDP checksum offload for IPv6 pkts Don't offload IP header checksum to NIC. This fixes a previous patch which enabled checksum offloading for both IPv4 and IPv6 packets. So L3 checksum offload was getting enabled for IPv6 pkts. And HW is dropping these pkts as it assumes the pkt is IPv4 when IP csum offload is set in the SQ descriptor. Fixes: `3a9024f52c` ("net: thunderx: Enable TSO and checksum offloads for ipv6") Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: Aleksey Makarov <aleksey.makarov@auriga.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-25 23:54:32 +09:00
Joe Perches	bf24e136a3	cavium: thunder: Remove duplicate "netdev->name" logging output Using netdev_<level>(netdev, "%s: ...", netdev->name) duplicates the name in the output. Remove those uses. Miscellanea: o Use the netif_<level> convenience macros at the same time Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-06-29 12:25:33 -04:00
Sunil Goutham	773225388d	net: thunderx: Optimize page recycling for XDP Driver follows a method of taking one extra reference on the page for recycling which is fine in usual packet path where each 64KB page is segmented into multiple receive buffers. But in XDP mode since there is just one receive buffer per page taking extra page reference itself becomes big bottleneck consuming ~50% of CPU cycles due to atomic operations. This patch adds a internal ref count in pgcache for each page and additional page references are taken in a batch instead of just one at a time. Internal i.e 'pgcache->ref_count' and page's i.e 'page->_refcount' counters are compared to check page's recyclability. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-02 15:41:22 -04:00
Sunil Goutham	e3d06ff9ec	net: thunderx: Support for XDP header adjustment When in XDP mode reserve XDP_PACKET_HEADROOM bytes at the start of receive buffer for XDP program to modify headers and adjust packet start. Additional code changes done to handle such packets. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-02 15:41:22 -04:00
Sunil Goutham	16f2bccda7	net: thunderx: Add support for XDP_TX Adds support for XDP_TX i.e transmits packet out of the XDP TX queue mapped to the corresponding Rx queue on which packet is received. Since SQ for XDP TX will be used only on a single cpu i.e SQ description creation and freeing, using atomic free count is not necessary and will become a bottleneck. Hence added a separate 'xdp_free_cnt' used for SQs designated for XDP to track descriptor free count. Changes also include - A new entry 'xdp_page' is added to save transmitted packet's page pointer for later cleanup. - XDP Tx SQ's doorbell is ringed once per NAPI instance. - Retrieving designated SQ for packets being sent out by stack via 'nicvf_xmit'. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-02 15:41:22 -04:00
Sunil Goutham	c56d91ce38	net: thunderx: Add support for XDP_DROP Adds support for XDP_DROP. Also since in XDP mode there is just a single buffer per page, made changes to recycle DMA mapping info as well along with pages. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-02 15:41:21 -04:00
Sunil Goutham	05c773f52b	net: thunderx: Add basic XDP support Adds basic XDP support i.e attaching a BPF program to an interface. Also takes care of allocating separate Tx queues for XDP path and for network stack packet transmission. This patch doesn't support handling of any of the XDP actions, all are treated as XDP_PASS i.e packets will be handed over to the network stack. Changes also involve allocating one receive buffer per page in XDP mode and multiple in normal mode i.e when no BPF program is attached. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-02 15:41:21 -04:00
Sunil Goutham	927987f39f	net: thunderx: Cleanup receive buffer allocation Get rid of unnecessary double pointer references and type casting in receive buffer allocation code. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-02 15:41:21 -04:00
Sunil Goutham	0dada88b8c	net: thunderx: Optimize CQE_TX handling Optimized CQE handling with below changes - Feeing descriptors back to SQ in bulk i.e once per NAPI instance instead for every CQE_TX, this will reduce number of atomic updates to 'sq->free_cnt'. - Checking errors in CQE_TX and CQE_RX before calling appropriate fn()s to update error stats i.e reduce branching. Also removed debug messages in packet handling path which otherwise causes issues if DEBUG is enabled. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-02 15:41:21 -04:00
Sunil Goutham	5e848e4c5d	net: thunderx: Optimize RBDR descriptor handling Receive buffer's physical address or iova will anyway not go beyond 49bits, since it is the max supported HW address. As per perf, updating bitfields i.e buf_addr:42 in RBDR descriptor entry consumes lots of cpu cycles, hence changed it to a 64bit field with alignment requirements taken care of. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-02 15:41:20 -04:00
Sunil Goutham	5836b44297	net: thunderx: Support for page recycling Adds support for page recycling for allocating receive buffers to reduce cost of refilling RBDR ring. Also got rid of using compound pages when pagesize is 4K, only order-0 pages now. Only page is recycled, DMA mappings still needs to be done for every receive buffer allocated due to following constraints - Cannot have just one receive buffer per 64KB page. - There is just one buffer ring shared across 8 Rx queues, so buffers of same page can go to any Rx queue. - HW gives buffer address where packet has been DMA'ed and not the index into buffer ring. This makes it not possible to resue DMA mapping info. So unfortunately have to go through costly mapping route for every buffer. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-05-02 15:41:20 -04:00
Thanneeru Srinivasulu	3a9024f52c	net: thunderx: Enable TSO and checksum offloads for ipv6 Adding support for TSO and checksum hardware offloads for ipv6. Signed-off-by: Thanneeru Srinivasulu <tsrinivasulu@cavium.com> Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-07 14:01:07 -07:00
Thanneeru Srinivasulu	36fa35d22b	net: thunderx: Allow IPv6 frames with zero UDP checksum Do not consider IPv6 frames with zero UDP checksum as frames with bad checksum and drop them. Signed-off-by: Thanneeru Srinivasulu <tsrinivasulu@cavium.com> Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-09 13:12:41 -08:00
Sunil Goutham	83abb7d7c9	net: thunderx: Fix IOMMU translation faults ACPI support has been added to ARM IOMMU driver in 4.10 kernel and that has resulted in VNIC interfaces throwing translation faults when kernel is booted with ACPI as driver was not using DMA API. This patch fixes the issue by using DMA API which inturn will create translation tables when IOMMU is enabled. Also VNIC doesn't have a seperate receive buffer ring per receive queue, so there is no 1:1 descriptor index matching between CQE_RX and the index in buffer ring from where a buffer has been used for DMA'ing. Unlike other NICs, here it's not possible to maintain dma address to virt address mappings within the driver. This leaves us no other choice but to use IOMMU's IOVA address conversion API to get buffer's virtual address which can be given to network stack for processing. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-03-09 13:12:41 -08:00
Sunil Goutham	fff4ffdde1	net: thunderx: Support to configure queue sizes from ethtool Adds support to set Rx/Tx queue sizes from ethtool. Fixes an issue with retrieving queue size. Also sets SQ's CQ_LIMIT based on configured Tx queue size such that HW doesn't process SQEs when there is no sufficient space in CQ. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-01-25 14:42:36 -05:00
Sunil Goutham	bd3ad7d3a1	net: thunderx: Fix transmit queue timeout issue Transmit queue timeout issue is seen in two cases - Due to a race condition btw setting stop_queue at xmit() and checking for stopped_queue in NAPI poll routine, at times transmission from a SQ comes to a halt. This is fixed by using barriers and also added a check for SQ free descriptors, incase SQ is stopped and there are only CQE_RX i.e no CQE_TX. - Contrary to an assumption, a HW errata where HW doesn't stop transmission even though there are not enough CQEs available for a CQE_TX is not fixed in T88 pass 2.x. This results in a Qset error with 'CQ_WR_FULL' stalling transmission. This is fixed by adjusting RXQ's RED levels for CQ level such that there is always enough space left for CQE_TXs. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-12-02 13:32:59 -05:00
Sunil Goutham	d5b2d7a718	net: thunderx: Configure RED and backpressure levels This patch enables moving average calculation of Rx pkt's resources and configures RED and backpressure levels for both CQ and RBDR. Also initialize SQ's CQ_LIMIT properly. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 20:21:17 -05:00
Sunil Goutham	c94acf805d	net: thunderx: Fix memory leak and other issues upon interface toggle This patch fixes the following 1. When interface is being teardown and queues are being cleaned up, free pending SKBs that are in SQ which are either not transmitted or freed as NAPI is disabled by that time. 2. While interface initialization, delay CFG_DONE notification till the end to avoid corner cases where TXQs are enabled but CQ interrupts are not which results blocking transmission and kicking off watchdog. 3. Check for IFF_UP while re-enabling RBDR interrupts from tasklet. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-16 13:28:33 -05:00
Sunil Goutham	964cb69bdc	net: thunderx: Fix VF driver's interface statistics This patch fixes multiple issues 1. Convert all driver statistics to percpu counters for accuracy. 2. To avoid multiple CQEs posted by a TSO packet appended to HW, TSO pkt's SQE has 'post_cqe' not set but a dummy SQE is added for getting HW transmit completion notification. This dummy SQE has 'dont_send' set and HW drops the pkt pointed to in this thus Tx drop counter increases. This patch fixes this by subtracting SW tx tso counter from HW Tx drop counter for actual packet drop counter. 3. Reset all individual queue's and VNIC HW stats when interface is going down. 4. Getrid off unnecessary counters in hot path. 5. Bringout all CQE error stats i.e both Rx and Tx. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-16 13:28:33 -05:00
Sunil Goutham	cadcf95a4f	net: thunderx: Fix configuration of L3/L4 length checking This patch fixes enabling of HW verification of L3/L4 length and TCP/UDP checksum which is currently being cleared. Also fixed VLAN stripping config which is being cleared when multiqset is enabled. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-16 13:28:33 -05:00
Sunil Goutham	712c318534	net: thunderx: Program LMAC credits based on MTU Programming LMAC credits taking 9K frame size by default is incorrect as for an interface which is one of the many on the same BGX/QLM no of credits available will be less as Tx FIFO will be divided across all interfaces. So let's say a BGX with 40G interface and another BGX with multiple 10G, bandwidth of 10G interfaces will be effected when traffic is running on both 40G and 10G interfaces simultaneously. This patch fixes this issue by programming credits based on netdev's MTU. Also fixed configuring MTU to HW and added CQE counter for pkts which exceed this value. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-16 13:28:33 -05:00
Sunil Goutham	2c204c2b9f	net: thunderx: Support for byte queue limits This patch adds support for byte queue limits Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-24 08:47:15 -04:00
David S. Miller	b20b378d49	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/ethernet/mediatek/mtk_eth_soc.c drivers/net/ethernet/qlogic/qed/qed_dcbx.c drivers/net/phy/Kconfig All conflicts were cases of overlapping commits. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-12 15:52:44 -07:00
Sunil Goutham	7ceb8a1319	net: thunderx: Fix for issues with multiple CQEs posted for a TSO packet On ThunderX 88xx pass 2.x chips when TSO is offloaded to HW, HW posts a CQE for every TSO segment transmitted. Current code does handles this, but is prone to issues when segment sizes are small resulting in SW processing too many CQEs and also at times frees a SKB which is not yet transmitted. This patch handles the errata in a different way and eliminates issues with earlier approach, TSO packet is submitted to HW with post_cqe=0, so that no CQE is posted upon completion of transmission of TSO packet but a additional HDR + IMMEDIATE descriptors are added to SQ due to which a CQE is posted and will have required info to be used while cleanup in napi. This way only one CQE is posted for a TSO packet. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-01 14:50:47 -07:00
Jerin Jacob	3458c40d60	net: thunderx: Reset RXQ HW stats when interface is brought down When SQ/TXQ is reclaimed i.e reset it's stats also automatically reset by HW. This is not the case with RQ. Also VF doesn't have write access to statistics counter registers. Hence a new Mbox msg is introduced which supports resetting RQ, SQ and full Qset stats. Currently only RQ stats are being reset using this mbox message. Signed-off-by: Jerin Jacob <jerin.jacob@caviumnetworks.com> Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-13 11:59:32 -07:00
Sunil Goutham	a8671acca8	net: thunderx: Use skb_add_rx_frag() for split buffer Rx pkts Instead of a round about way of converting buffers to SKBs and combining them into a frag list, use standard skb_add_rx_frag() API to merge page fragments. This code is useful when incoming packets are of size more than RCV_FRAG_LEN which is currently set to 2048bytes. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-13 11:59:32 -07:00
Sunil Goutham	02a72bd8cd	net: thunderx: Enable CQE_RX desc's extension fields Unlike 88xx, CQE_RX descriptor's tunnelling extension i.e CQE_RX2_S is always enabled on 81xx/83xx and HW does insert these fields into CQE_RX. As a result receive buffer addresses will now be present at 7th word of CQE_RX instead of 6th. Enable CQE_RX2_S on 88xx pass 2.x as well. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-13 11:59:30 -07:00
Sunil Goutham	3a397ebe15	net: thunderx: Set queue count based on number of CPUs 81xx has only 4 CPUs, so it doesn't make sense to initialize entire Qset i.e 8 queues by default. Made changes to queue initialization to init queues equal to number of CPUs or 8 queues whichever is lesser. Also this will be applicable to VMs with VNIC VF attached and having less VCPUs Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-08-13 11:59:30 -07:00
Joonsoo Kim	6d061f9f61	mm/page_ref: use page_ref helper instead of direct modification of _count page_reference manipulation functions are introduced to track down reference count change of the page. Use it instead of direct modification of _count. Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Hugh Dickins <hughd@google.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: "David S. Miller" <davem@davemloft.net> Cc: Sunil Goutham <sgoutham@cavium.com> Cc: Chris Metcalf <cmetcalf@mellanox.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-05-19 19:12:14 -07:00
xypron.glpk@gmx.de	161de2caf6	net: thunderx: avoid exposing kernel stack Reserved fields should be set to zero to avoid exposing bits from the kernel stack. Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-05-10 00:36:09 -04:00
Sunil Goutham	5c2e26f6f6	net: thunderx: Set recevie buffer page usage count in bulk Instead of calling get_page() for every receive buffer carved out of page, set page's usage count at the end, to reduce no of atomic calls. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-03-14 12:33:36 -04:00
David S. Miller	b633353115	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts: drivers/net/phy/bcm7xxx.c drivers/net/phy/marvell.c drivers/net/vxlan.c All three conflicts were cases of simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-23 00:09:14 -05:00
Sunil Goutham	ad2ecebd67	net: thunderx: Fix receive packet stats Counting rx packets for every CQE_RX in CQ irq handler is incorrect. Synchronization is missing when multiple queues are receiving packets simultaneously. Like transmit packet stats use HW stats here. Also removed unused 'cqe_type' parameter in nicvf_rcv_pkt_handler(). Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-17 22:24:57 -05:00
Sunil Goutham	6e4be8d671	net: thunderx: Alloc higher order pages when pagesize is small Allocate higher order pages when pagesize is small, this will reduce number of calls to page allocator and wastage of memory. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-11 11:30:26 -05:00
Thanneeru Srinivasulu	a05d484590	net, thunderx: Add TX timeout and RX buffer alloc failure stats. When system is low on atomic memory, too many error messages are logged. Since this is not a total failure but a simple switch to non-atomic allocation better to have a stat. Also add a stat for reset, kicked due to transmit watchdog timeout. Signed-off-by: Thanneeru Srinivasulu <tsrinivasulu@caviumnetworks.com> Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-02-11 11:30:26 -05:00
Sunil Goutham	b9687b48a6	net: thunderx: Enable CQE count threshold interrupt This feature is introduced in pass-2 chip and with this CQ interrupt coalescing will work based on both timer and count. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-12-11 23:38:17 -05:00
Sunil Goutham	40fb5f8a60	net: thunderx: HW TSO support for pass-2 hardware This adds support for offloading TCP segmentation to HW in pass-2 revision of hardware. Both driver level SW TSO for pass1.x chips and HW TSO for pass-2 chip will co-exist. Modified SQ descriptor structures to reflect pass-2 hw implementation. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-12-11 23:38:17 -05:00
Sunil Goutham	668dda06d4	net, thunderx: Remove unnecessary rcv buffer start address management Since we have moved on to using allocated pages to carve receive buffers instead of netdev_alloc_skb() there is no need to store any pointers for later retrieval. Earlier we had to store skb and skb->data pointers which later are used to handover received packet to network stack. This will avoid an unnecessary cache miss as well. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2015-12-07 13:40:01 -05:00

1 2

64 Commits