linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-28 11:18:45 +07:00

Author	SHA1	Message	Date
Jakub Kicinski	cc54dc2804	nfp: abm: create project-specific vNIC structure ABM NIC requires more complex vNIC handling, allocate per-vNIC structure. Find out RX queue base and PCI PF id. There will be multiple PFs sharing the same MAC port, therefore the MAC address assigned to the vNIC must be looked up in the HWInfo database. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-23 14:26:18 -04:00
Jakub Kicinski	c4c8f39a57	nfp: abm: add initial active buffer management NIC skeleton Add a very rudimentary active buffer management NIC support. For now it's like a core NIC without SR-IOV support. Next commits will extend its functionality. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-23 14:26:18 -04:00
Jakub Kicinski	b586c77b3c	nfp: core: allow 4-byte aligned accesses to Memory Units Current code doesn't enforce length requirements on 32bit accesses with action NFP_CPP_ACTION_RW to memory units, but if the access is only aligned to 4 bytes as well we will fall into the explicit access case and error out. Such accesses are correct, allow them by lowering the width earlier. While at it use a switch statement to improve readability. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-23 14:26:18 -04:00
Jakub Kicinski	a0d163f432	nfp: add shared buffer configuration Allow app FW to advertise its shared buffer pool information. Use the per-PF mailbox to configure them from devlink. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-23 14:26:18 -04:00
Jakub Kicinski	0c693323a1	nfp: add support for per-PCI PF mailbox When working with devlink-related functionality for locking reasons it's easier to create a new mailbox per-PCI PF device than try to use one of the netdev/vNIC mailboxes. Define new mailbox structure and resolve its symbol during probe. For forward compatibility allow silent truncation of mailbox command data. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-23 14:26:18 -04:00
Jakub Kicinski	8f6196f63c	nfp: move rtsym helpers to pf code nfp_net_pf_rtsym_read_optional() and nfp_net_pf_map_rtsym() are not really related to networking code. Move them to the PF code and remove the net from their names. They will soon be needed by code outside of nfp_net_main.c anyway. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-23 14:26:18 -04:00
David S. Miller	6f6e434aa2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net S390 bpf_jit.S is removed in net-next and had changes in 'net', since that code isn't used any more take the removal. TLS data structures split the TX and RX components in 'net-next', put the new struct members from the bug fix in 'net' into the RX part. The 'net-next' tree had some reworking of how the ERSPAN code works in the GRE tunneling code, overlapping with a one-line headroom calculation fix in 'net'. Overlapping changes in __sock_map_ctx_update_elem(), keep the bits that read the prog members via READ_ONCE() into local variables before using them. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-21 16:01:54 -04:00
Jiri Pirko	5ec1380a21	devlink: extend attrs_set for setting port flavours Devlink ports can have specific flavour according to the purpose of use. This patch extend attrs_set so the driver can say which flavour port has. Initial flavours are: physical, cpu, dsa User can query this to see right away what is the purpose of each port. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-19 16:30:39 -04:00
Jiri Pirko	b9ffcbaf56	devlink: introduce devlink_port_attrs_set Change existing setter for split port information into more generic attrs setter. Alongside with that, allow to set port number and subport number for split ports. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-19 16:30:39 -04:00
Jiong Wang	c217abccaa	nfp: bpf: support arithmetic indirect right shift (BPF_ARSH | BPF_X) Code logic is similar with arithmetic right shift by constant, and NFP get indirect shift amount through source A operand of PREV_ALU. It is possible to fall back to logic right shift if the MSB is known to be zero from range info, however there is no benefit to do this given logic indirect right shift use the same number and cycle of instruction sequence. Suppose the MSB of regX is the bit we want to replicate to fill in all the vacant positions, and regY contains the shift amount, then we could use single instruction to set up both. [alu, --, regY, OR, regX] -- NOTE: the PREV_ALU result doesn't need to write to any destination register. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-18 21:35:55 +02:00
Jiong Wang	f43d0f17fe	nfp: bpf: support arithmetic right shift by constant (BPF_ARSH | BPF_K) Code logic is similar with logic right shift except we also need to set PREV_ALU result properly, the MSB of which is the bit that will be replicated to fill in all the vacant positions. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-18 21:35:55 +02:00
Jiong Wang	991f5b3651	nfp: bpf: support logic indirect shifts (BPF_[L|R]SH | BPF_X) For indirect shifts, shift amount is not specified as constant, NFP needs to get the shift amount through the low 5 bits of source A operand in PREV_ALU, therefore extra instructions are needed compared with shifts by constants. Because NFP is 32-bit, so we are using register pair for 64-bit shifts and therefore would need different instruction sequences depending on whether shift amount is less than 32 or not. NFP branch-on-bit-test instruction emitter is added by this patch and is used for efficient runtime check on shift amount. We'd think the shift amount is less than 32 if bit 5 is clear and greater or equal than 32 otherwise. Shift amount is greater than or equal to 64 will result in undefined behavior. This patch also use range info to avoid generating unnecessary runtime code if we are certain shift amount is less than 32 or not. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-18 21:35:54 +02:00
Jiri Pirko	3b734ff604	nfp: flower: fix error path during representor creation Don't store repr pointer to reprs array until the representor is successfully created. This avoids message about "representor destruction" even when it was never created. Also it cleans-up the flow. Also, check return value after port alloc. Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-17 16:23:29 -04:00
David S. Miller	b9f672af14	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-05-17 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Provide a new BPF helper for doing a FIB and neighbor lookup in the kernel tables from an XDP or tc BPF program. The helper provides a fast-path for forwarding packets. The API supports IPv4, IPv6 and MPLS protocols, but currently IPv4 and IPv6 are implemented in this initial work, from David (Ahern). 2) Just a tiny diff but huge feature enabled for nfp driver by extending the BPF offload beyond a pure host processing offload. Offloaded XDP programs are allowed to set the RX queue index and thus opening the door for defining a fully programmable RSS/n-tuple filter replacement. Once BPF decided on a queue already, the device data-path will skip the conventional RSS processing completely, from Jakub. 3) The original sockmap implementation was array based similar to devmap. However unlike devmap where an ifindex has a 1:1 mapping into the map there are use cases with sockets that need to be referenced using longer keys. Hence, sockhash map is added reusing as much of the sockmap code as possible, from John. 4) Introduce BTF ID. The ID is allocated through an IDR similar as with BPF maps and progs. It also makes BTF accessible to user space via BPF_BTF_GET_FD_BY_ID and adds exposure of the BTF data through BPF_OBJ_GET_INFO_BY_FD, from Martin. 5) Enable BPF stackmap with build_id also in NMI context. Due to the up_read() of current->mm->mmap_sem build_id cannot be parsed. This work defers the up_read() via a per-cpu irq_work so that at least limited support can be enabled, from Song.
6) Various BPF JIT follow-up cleanups and fixups after the LD_ABS/LD_IND JIT conversion as well as implementation of an optimized 32/64 bit immediate load in the arm64 JIT that allows to reduce the number of emitted instructions; in case of tested real-world programs they were shrinking by three percent, from Daniel. 7) Add ifindex parameter to the libbpf loader in order to enable BPF offload support. Right now only iproute2 can load offloaded BPF and this will also enable libbpf for direct integration into other applications, from David (Beckett). 8) Convert the plain text documentation under Documentation/bpf/ into RST format since this is the appropriate standard the kernel is moving to for all documentation. Also add an overview README.rst, from Jesper. 9) Add __printf verification attribute to the bpf_verifier_vlog() helper. Though it uses va_list we can still allow gcc to check the format string, from Mathieu. 10) Fix a bash reference in the BPF selftest's Makefile. The '|& ...' is a bash 4.0+ feature which is not guaranteed to be available when calling out to shell, therefore use a more portable variant, from Joe. 11) Fix a 64 bit division in xdp_umem_reg() by using div_u64() instead of relying on the gcc built-in, from Björn. 12) Fix a sock hashmap kmalloc warning reported by syzbot when an overly large key size is used in hashmap then causing overflows in htab->elem_size. Reject bogus attr->key_size early in the sock_hash_alloc(), from Yonghong. 13) Ensure in BPF selftests when urandom_read is being linked that --build-id is always enabled so that test_stacktrace_build_id[_nmi] won't be failing, from Alexei. 14) Add bitsperlong.h as well as errno.h uapi headers into the tools header infrastructure which point to one of the arch specific uapi headers. This was needed in order to fix a build error on some systems for the BPF selftests, from Sirio. 15) Allow for short options to be used in the xdp_monitor BPF sample code.
And also a bpf.h tools uapi header sync in order to fix a selftest build failure. Both from Prashant. 16) More formally clarify the meaning of ID in the direct packet access section of the BPF documentation, from Wang. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-16 22:47:11 -04:00
David S. Miller	9d6b4bfb59	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2018-05-14 The following pull-request contains BPF updates for your net tree. The main changes are: 1) Fix nfp to allow zero-length BPF capabilities, meaning the nfp capability parsing loop will otherwise exit early if the last capability is zero length and therefore driver will fail to probe with an error such as: nfp: BPF capabilities left after parsing, parsed:92 total length:100 nfp: invalid BPF capabilities at offset:92 Fix from Jakub. 2) libbpf's bpf_object__open() may return IS_ERR_OR_NULL() and not just an error. Fix libbpf's bpf_prog_load_xattr() to handle that case as well, also from Jakub. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-13 21:07:02 -04:00
David S. Miller	b2d6cee117	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net The bpf syscall and selftests conflicts were trivial overlapping changes. The r8169 change involved moving the added mdelay from 'net' into a different function. A TLS close bug fix overlapped with the splitting of the TLS state into separate TX and RX parts. I just expanded the tests in the bug fix from "ctx->conf == X" into "ctx->tx_conf == X && ctx->rx_conf == X". Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-11 20:53:22 -04:00
Pieter Jansen van Vuuren	df13c59b54	nfp: flower: remove headroom from max MTU calculation Since commit `29a5dcae27` ("nfp: flower: offload phys port MTU change") we take encapsulation headroom into account when calculating the max allowed MTU. This is unnecessary as the max MTU advertised by firmware should have already accounted for encap headroom. Subtracting headroom twice brings the max MTU below what's necessary for some deployments. Fixes: `29a5dcae27` ("nfp: flower: offload phys port MTU change") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-10 15:28:01 -04:00
Jakub Kicinski	26aeb9daa0	nfp: bpf: allow zero-length capabilities Some BPF capabilities carry no value, they simply indicate feature is present. Our capability parsing loop will exit early if last capability is zero-length because it's looking for more than 8 bytes of data (8B is our TLV header length). Allow the last capability to be zero-length. This bug would lead to driver failing to probe with the following error if the last capability FW advertises is zero-length: nfp: BPF capabilities left after parsing, parsed:92 total length:100 nfp: invalid BPF capabilities at offset:92 Note the "parsed" and "length" values are 8 apart. No shipping FW runs into this issue, but we can't guarantee that will remain the case. Fixes: `77a844ee65` ("nfp: bpf: prepare for parsing BPF FW capabilities") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-09 18:17:17 +02:00
Jakub Kicinski	d985888faa	nfp: bpf: support setting the RX queue index BPF has access to all internal FW datapath structures. Including the structure containing RX queue selection. With little coordination with the datapath we can let the offloaded BPF select the RX queue. We just need a way to tell the datapath that queue selection has already been done and it shouldn't overwrite it. Define a bit to tell datapath BPF already selected a queue (QSEL_SET), if the selected queue is not enabled (>= number of enabled queues) datapath will perform normal RSS. BPF queue selection on the NIC can be used to replace standard datapath RSS with fully programmable BPF/XDP RSS. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-09 18:04:37 +02:00
David S. Miller	01adc4851a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Minor conflict, a CHECK was placed into an if() statement in net-next, whilst a newline was added to that CHECK call in 'net'. Thanks to Daniel for the merge resolution. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-07 23:35:08 -04:00
Jakub Kicinski	b4264c96b5	nfp: bpf: rewrite map pointers with NFP TIDs Kernel will now replace map fds with actual pointer before calling the offload prepare. We can identify those pointers and replace them with NFP table IDs instead of loading the table ID in code generated for CALL instruction. This allows us to support having the same CALL being used with different maps. Since we don't want to change the FW ABI we still need to move the TID from R1 to portion of R0 before the jump. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-04 23:41:03 +02:00
Jakub Kicinski	9816dd35ec	nfp: bpf: perf event output helpers support Add support for the perf_event_output family of helpers. The implementation on the NFP will not match the host code exactly. The state of the host map and rings is unknown to the device, hence device can't return errors when rings are not installed. The device simply packs the data into a firmware notification message and sends it over to the host, returning success to the program. There is no notion of a host CPU on the device when packets are being processed. Device will only offload programs which set BPF_F_CURRENT_CPU. Still, if map index doesn't match CPU no error will be returned (see above). Dropped/lost firmware notification messages will not cause "lost events" event on the perf ring, they are only visible via device error counters. Firmware notification messages may also get reordered in respect to the packets which caused their generation. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-04 23:41:03 +02:00
Jakub Kicinski	630a4d3874	nfp: bpf: record offload neutral maps in the driver For asynchronous events originating from the device, like perf event output, we need to be able to make sure that objects being referred to by the FW message are valid on the host. FW events can get queued and reordered. Even if we had a FW message "barrier" we should still protect ourselves from bogus FW output. Add a reverse-mapping hash table and record in it all raw map pointers FW may refer to. Only record neutral maps, i.e. perf event arrays. These are currently the only objects FW can refer to. Use RCU protection on the read side, update side is under RTNL. Since program vs map destruction order is slightly painful for offload simply take an extra reference on all the recorded maps to make sure they don't disappear. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-05-04 23:41:03 +02:00
David S. Miller	a7b15ab887	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Overlapping changes in selftests Makefile. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-04 09:58:56 -04:00
John Hurley	50a5852a65	nfp: flower: set tunnel ttl value to net default Firmware requires that the ttl value for an encapsulating ipv4 tunnel header be included as an action field. Prior to the support of Geneve tunnel encap (when ttl set was removed completely), ttl value was extracted from the tunnel key. However, tests have shown that this can still produce a ttl of 0. Fix the issue by setting the namespace default value for each new tunnel. Follow up patch for net-next will do a full route lookup. Fixes: `3ca3059dc3` ("nfp: flower: compile Geneve encap actions") Fixes: `b27d6a95a7` ("nfp: compile flower vxlan tunnel set actions") Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-05-01 18:59:57 -04:00
Jakub Kicinski	c55ca688ed	nfp: don't depend on eth_tbl being available For very very old generation of the management FW Ethernet port information table may theoretically not be available. This in turn will cause the nfp_port structures to not be allocated. Make sure we don't crash the kernel when there is no eth_tbl: RIP: 0010:nfp_net_pci_probe+0xf2/0xb40 [nfp] ... Call Trace: nfp_pci_probe+0x6de/0xab0 [nfp] local_pci_probe+0x47/0xa0 work_for_cpu_fn+0x1a/0x30 process_one_work+0x1de/0x3e0 Found while working with broken/development version of management FW. Fixes: `a5950182c0` ("nfp: map mac_stats and vf_cfg BARs") Fixes: `93da7d9660` ("nfp: provide nfp_port to of nfp_net_get_mac_addr()") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-27 11:15:10 -04:00
David S. Miller	79741a38b4	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-04-27 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Add extensive BPF helper description into include/uapi/linux/bpf.h and a new script bpf_helpers_doc.py which allows for generating a man page out of it. Thus, every helper in BPF now comes with proper function signature, detailed description and return code explanation, from Quentin. 2) Migrate the BPF collect metadata tunnel tests from BPF samples over to the BPF selftests and further extend them with v6 vxlan, geneve and ipip tests, simplify the ipip tests, improve documentation and convert to bpf_ntoh() / bpf_hton() api, from William. 3) Currently, helpers that expect ARG_PTR_TO_MAP_{KEY,VALUE} can only access stack and packet memory. Extend this to allow such helpers to also use map values, which enabled use cases where value from a first lookup can be directly used as a key for a second lookup, from Paul. 4) Add a new helper bpf_skb_get_xfrm_state() for tc BPF programs in order to retrieve XFRM state information containing SPI, peer address and reqid values, from Eyal. 5) Various optimizations in nfp driver's BPF JIT in order to turn ADD and SUB instructions with negative immediate into the opposite operation with a positive immediate such that nfp can better fit small immediates into instructions. Savings in instruction count up to 4% have been observed, from Jakub. 6) Add the BPF prog's gpl_compatible flag to struct bpf_prog_info and add support for dumping this through bpftool, from Jiri. 7) Move the BPF sockmap samples over into BPF selftests instead since sockmap was rather a series of tests than sample anyway and this way this can be run from automated bots, from John. 8) Follow-up fix for bpf_adjust_tail() helper in order to make it work with generic XDP, from Nikita. 
9) Some follow-up cleanups to BTF, namely, removing unused defines from BTF uapi header and renaming 'name' struct btf_* members into name_off to make it more clear they are offsets into string section, from Martin. 10) Remove test_sock_addr from TEST_GEN_PROGS in BPF selftests since not run directly but invoked from test_sock_addr.sh, from Yonghong. 11) Remove redundant ret assignment in sample BPF loader, from Wang. 12) Add couple of missing files to BPF selftest's gitignore, from Anders. There are two trivial merge conflicts while pulling: 1) Remove samples/sockmap/Makefile since all sockmap tests have been moved to selftests. 2) Add both hunks from tools/testing/selftests/bpf/.gitignore to the file since git should ignore all of them. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-26 21:19:50 -04:00
John Hurley	c50647d3e8	nfp: flower: ignore duplicate cb requests for same rule If a flower rule has a repr both as ingress and egress port then 2 callbacks may be generated for the same rule request. Add an indicator to each flow as to whether or not it was added from an ingress registered cb. If so then ignore add/del/stat requests to it from an egress cb. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-25 14:07:04 -04:00
John Hurley	54a4a03439	nfp: flower: support offloading multiple rules with same cookie When multiple netdevs are attached to a tc offload block and register for callbacks, a rule added to the block will be propagated to all netdevs. Previously these were detected as duplicates (based on cookie) and rejected. Modify the rule nfp lookup function to optionally include an ingress netdev and a host context along with the cookie value when searching for a rule. When a new rule is passed to the driver, the netdev the rule is to be attached to is considered when searching for duplicates. When a stats update is received from HW, the host context is used alongside the cookie to map to the correct host rule. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-25 14:07:04 -04:00
Jakub Kicinski	dd92a7d1af	nfp: print PCIe link bandwidth on probe To aid debugging of performance issues caused by limited PCIe bandwidth print the PCIe link information on probe. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-25 14:07:04 -04:00
Jakub Kicinski	3e3e9fd8b6	nfp: reset local locks on init NFP locks record the owner when held, for PCIe devices the owner ID will be the PCIe link number. When driver loads it should scan known locks and if they indicate that they are held by local endpoint but the driver doesn't hold them - release them. Locks can be left taken for instance when kernel gets kexec-ed or after a crash. Management FW tries to clean up stale locks too, but it currently depends on PCIe link going down which doesn't always happen. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-25 14:07:04 -04:00
Jakub Kicinski	7bdc97be90	nfp: bpf: optimize comparisons to negative constants Comparison instruction requires a subtraction. If the constant is negative we are more likely to fit it into a NFP instruction directly if we change the sign and use addition. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-04-25 09:56:10 +02:00
Jakub Kicinski	61dd8f0007	nfp: bpf: tabularize generations of compare operations There are quite a few compare instructions now, use a table to translate BPF instruction code to NFP instruction parameters instead of parameterizing helpers. This saves LOC and makes future extensions easier. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-04-25 09:56:10 +02:00
Jakub Kicinski	6c59500c2d	nfp: bpf: optimize add/sub of a negative constant NFP instruction set can fit small immediates into the instruction. Negative integers, however, will never fit because they will have highest bit set. If we swap the ALU op between ADD and SUB and negate the constant we have a better chance of fitting small negative integers into the instruction itself and saving one or two cycles. immed[gprB_21, 0xfffffffc] alu[gprA_4, gprA_4, +, gprB_21], gpr_wrboth immed[gprB_21, 0xffffffff] alu[gprA_5, gprA_5, +carry, gprB_21], gpr_wrboth now becomes: alu[gprA_4, gprA_4, -, 4], gpr_wrboth alu[gprA_5, gprA_5, -carry, 0], gpr_wrboth Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-04-25 09:56:10 +02:00
Jakub Kicinski	9c9e53233c	nfp: bpf: remove double space Whitespace cleanup - remove double space. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-04-25 09:56:10 +02:00
David S. Miller	e0ada51db9	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflicts were simple overlapping changes in microchip driver. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-21 16:32:48 -04:00
Nikita V. Shirokov	5a6a22e378	bpf: make netronome nfp compatible w/ bpf_xdp_adjust_tail w/ bpf_xdp_adjust_tail helper xdp's data_end pointer could be changed as well (only "decrease" of pointer's location is going to be supported). changing of this pointer will change packet's size. for nfp driver we will just calculate packet's length unconditionally Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Nikita V. Shirokov <tehnerd@tehnerd.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-04-18 23:34:16 +02:00
Pieter Jansen van Vuuren	cf2cbadc20	nfp: flower: split and limit cmsg skb lists Introduce a second skb list for handling control messages and limit the number of allowed messages. Some control messages are considered more crucial than others, resulting in the need for a second skb list. By splitting the list into a separate high and low priority list we can ensure that messages on the high list get added to the head of the list that gets processed, this however has no functional impact. Previously there was no limit on the number of messages allowed on the queue, this could result in the queue growing boundlessly and eventually the host running out of memory. Fixes: `b985f870a5` ("nfp: process control messages in workqueue in flower app") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-12 21:57:28 -04:00
Pieter Jansen van Vuuren	0b1a989ef5	nfp: flower: move route ack control messages out of the workqueue Previously we processed the route ack control messages in the workqueue, this unnecessarily loads the workqueue. We can deal with these messages sooner as we know we are going to drop them. Fixes: `8e6a9046b6` ("nfp: flower vxlan neighbour offload") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-12 21:57:28 -04:00
Jakub Kicinski	bc05f9bcd8	nfp: print a message when mutex wait is interrupted When waiting for an NFP mutex is interrupted print a message to make root causing later error messages easier. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-12 21:57:27 -04:00
Jakub Kicinski	5496295aef	nfp: ignore signals when communicating with management FW We currently allow signals to interrupt the wait for management FW commands. Exiting the wait should not cause trouble, the FW will just finish executing the command in the background and new commands will wait for the old one to finish. However, this may not be what users expect (Ctrl-C not actually stopping the command). Moreover some systems routinely request link information with signals pending (Ubuntu 14.04 runs a landscape-sysinfo python tool from MOTD) worrying users with errors like these: nfp 0000:04:00.0: nfp_nsp: Error -512 waiting for code 0x0007 to start nfp 0000:04:00.0: nfp: reading port table failed -512 Make the wait for management FW responses non-interruptible. Fixes: `1a64821c6a` ("nfp: add support for service processor access") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-12 21:57:27 -04:00
Dirk van der Merwe	1489bbd10e	nfp: use full 40 bits of the NSP buffer address The NSP default buffer is a piece of NFP memory where additional command data can be placed. Its format has been copied from host buffer, but the PCIe selection bits do not make sense in this case. If those get masked out from a NFP address - writes to random place in the chip memory may be issued and crash the device. Even in the general NSP buffer case, it doesn't make sense to have the PCIe selection bits there anymore. These are unused at the moment, and when it becomes necessary, the PCIe selection bits should rather be moved to another register to utilise more bits for the buffer address. This has never been an issue because the buffer used to be allocated in memory with less-than-38-bit-long address but that is about to change. Fixes: `1a64821c6a` ("nfp: add support for service processor access") Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-04 11:45:24 -04:00
Jakub Kicinski	0df57e604c	nfp: add a separate counter for packets with CHECKSUM_COMPLETE We are currently counting packets with CHECKSUM_COMPLETE as "hw_rx_csum_ok". This is confusing. Add a new counter. To make sure it fits in the same cacheline move the less used error counter to a different location. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-04 11:36:50 -04:00
David S. Miller	c0b458a946	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Minor conflicts in drivers/net/ethernet/mellanox/mlx5/core/en_rep.c, we had some overlapping changes: 1) In 'net' MLX5E_PARAMS_LOG_{SQ,RQ}_SIZE --> MLX5E_REP_PARAMS_LOG_{SQ,RQ}_SIZE 2) In 'net-next' params->log_rq_size is renamed to be params->log_rq_mtu_frames. 3) In 'net-next' params->hard_mtu is added. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-04-01 19:49:34 -04:00
David S. Miller	d4069fe6fc	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-03-31 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Add raw BPF tracepoint API in order to have a BPF program type that can access kernel internal arguments of the tracepoints in their raw form similar to kprobes based BPF programs. This infrastructure also adds a new BPF_RAW_TRACEPOINT_OPEN command to BPF syscall which returns an anon-inode backed fd for the tracepoint object that allows for automatic detach of the BPF program resp. unregistering of the tracepoint probe on fd release, from Alexei. 2) Add new BPF cgroup hooks at bind() and connect() entry in order to allow BPF programs to reject, inspect or modify user space passed struct sockaddr, and as well a hook at post bind time once the port has been allocated. They are used in FB's container management engine for implementing policy, replacing fragile LD_PRELOAD wrapper intercepting bind() and connect() calls that only works in limited scenarios like glibc based apps but not for other runtimes in containerized applications, from Andrey. 3) BPF_F_INGRESS flag support has been added to sockmap programs for their redirect helper call bringing it in line with cls_bpf based programs. Support is added for both variants of sockmap programs, meaning for tx ULP hooks as well as recv skb hooks, from John. 4) Various improvements on BPF side for the nfp driver, besides others this work adds BPF map update and delete helper call support from the datapath, JITing of 32 and 64 bit XADD instructions as well as offload support of bpf_get_prandom_u32() call. Initial implementation of nfp packet cache has been tackled that optimizes memory access (see merge commit for further details), from Jakub and Jiong. 
5) Removal of struct bpf_verifier_env argument from the print_bpf_insn() API has been done in order to prepare to use print_bpf_insn() soon out of perf tool directly. This makes the print_bpf_insn() API more generic and pushes the env into private data. bpftool is adjusted as well with the print_bpf_insn() argument removal, from Jiri. 6) Couple of cleanups and prep work for the upcoming BTF (BPF Type Format). The latter will reuse the current BPF verifier log as well, thus bpf_verifier_log() is further generalized, from Martin. 7) For bpf_getsockopt() and bpf_setsockopt() helpers, IPv4 IP_TOS read and write support has been added in similar fashion to existing IPv6 IPV6_TCLASS socket option we already have, from Nikita. 8) Fixes in recent sockmap scatterlist API usage, which did not use sg_init_table() for initialization thus triggering a BUG_ON() in scatterlist API when CONFIG_DEBUG_SG was enabled. This adds and uses a small helper sg_init_marker() to properly handle the affected cases, from Prashant. 9) Let the BPF core follow IDR code convention and therefore use the idr_preload() and idr_preload_end() helpers, which would also help idr_alloc_cyclic() under GFP_ATOMIC to better succeed under memory pressure, from Shaohua. 10) Last but not least, a spelling fix in an error message for the BPF cookie UID helper under BPF sample code, from Colin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-03-31 23:33:04 -04:00
John Hurley	29a5dcae27	nfp: flower: offload phys port MTU change Trigger a port mod message to request an MTU change on the NIC when any physical port representor is assigned a new MTU value. The driver waits 10 msec for an ack that the FW has set the MTU. If no ack is received the request is rejected and an appropriate warning flagged. Rather than maintain an MTU queue per repr, one is maintained per app. Because the MTU ndo is protected by the rtnl lock, there can never be contention here. Portmod messages from the NIC are also protected by rtnl so we first check if the portmod is an ack and, if so, handle outside rtnl and the cmsg work queue. Acks are detected by the marking of a bit in a portmod response. They are then verified by checking the port number and MTU value expected by the app. If the expected MTU is 0 then no acks are currently expected. Also, ensure that the packet headroom reserved by the flower firmware is considered when accepting an MTU change on any repr. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-03-30 10:18:55 -04:00
John Hurley	167cebeffa	nfp: modify app MTU setting callbacks Rename the 'change_mtu' app callback to 'check_mtu'. This is called whenever an MTU change is requested on a netdev. It can reject the change but is not responsible for implementing it. Introduce a new 'repr_change_mtu' app callback that is hit when the MTU of a repr is to be changed. This is responsible for performing the MTU change and verifying it. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-03-30 10:18:54 -04:00
Jakub Kicinski	7c095f5d9b	nfp: bpf: improve wrong FW response warnings When FW responds with a message of wrong size or type make sure the type is checked first and included in the wrong size message. This makes it easier to figure out which FW command failed. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:14 -07:00
Jakub Kicinski	df4a37d8b5	nfp: bpf: add support for bpf_get_prandom_u32() NFP has a prng register, which we can read to obtain a u32 worth of pseudo random data. Generate code for it. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:14 -07:00
Jakub Kicinski	41aed09cf6	nfp: bpf: add support for atomic add of unknown values Allow atomic add to be used even when the value is not guaranteed to fit into a 16 bit immediate. This requires the value to be pulled as data, and therefore use of a transfer register and a context swap. Track the information about possible lengths of the value, if it's guaranteed to be larger than 16bits don't generate the code for the optimized case at all. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:14 -07:00
Jakub Kicinski	b556ddd9c1	nfp: bpf: expose command delay slots Allow callers to control the delay slots of commands, instead of giving them just a wait/nowait choice. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:14 -07:00
Jakub Kicinski	dcb0c27f3c	nfp: bpf: add basic support for atomic adds Implement atomic add operation for 32 and 64 bit values. Depend on the verifier to ensure alignment. Values have to be kept in big endian and swapped upon read/write. For now only support atomic add of a constant. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:13 -07:00
Jakub Kicinski	bfee64deaa	nfp: bpf: add map deletes from the datapath Support calling the map_delete_elem() FW helper from the datapath programs. For the JIT, checks and code are basically equivalent to map lookups. As with the other map helpers, the key must be on the stack. Different pointer types are left for future extension. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:13 -07:00
Jakub Kicinski	44d65a47ae	nfp: bpf: add map updates from the datapath Support calling map_update_elem() from the datapath programs by calling into FW-provided helper. Value pointer is passed in LM pointer #2. Keeping track of old state for arg3 is not necessary, since LM pointer #2 will be always loaded in this case, the trivial optimization for value at the bottom of the stack can't be done here. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:13 -07:00
Jakub Kicinski	289c5b7630	nfp: bpf: add helper for basic map call checks Add a verifier helper for performing the basic state checks before a call to a map helper. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:13 -07:00
Jakub Kicinski	2f46e0c127	nfp: bpf: add helper for validating stack pointers Our implementation has restriction on stack pointers for function calls. Move the common checks into a helper for reuse. The state has to be encapsulated into a structure to support parameters other than BPF_REG_2. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:13 -07:00
Jakub Kicinski	fc4484970e	nfp: bpf: rename map_lookup_stack() to map_call_stack_common() We will reuse most of map call code gen for other map calls. Rename the lookup gen function and use meta->func_id instead of hard-coding lookup. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:13 -07:00
Jiong Wang	87b10ecdce	nfp: bpf: detect packet reads could be cached, enable the optimisation This patch is the front end of this optimisation, it detects and marks those packet reads that could be cached. Then the optimisation "backend" will be activated automatically. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:12 -07:00
Jiong Wang	91ff69e840	nfp: bpf: support unaligned read offset This patch adds support for unaligned read offsets, i.e. the read offset to the start of the packet cache area is not aligned to REG_WIDTH. In this case, the read area might span up to three transfer-in registers. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:12 -07:00
Jiong Wang	be75923786	nfp: bpf: read from packet data cache for PTR_TO_PACKET This patch assumes there is a packet data cache, and would try to read packet data from the cache instead of from memory. This patch only implements the optimisation "backend", it doesn't build the packet data cache, so this optimisation is not enabled. This patch has only enabled aligned packet data read, i.e. when the read offset to the start of cache is REG_WIDTH aligned. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-28 19:36:12 -07:00
Pieter Jansen van Vuuren	71ea5343a0	nfp: flower: implement ip fragmentation match offload Implement ip fragmentation match offloading for both IPv4 and IPv6. Allows offloading frag, nofrag, first and nofirstfrag classification. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-03-26 13:01:09 -04:00
Pieter Jansen van Vuuren	07e1671cfc	nfp: flower: refactor shared ip header in match offload Refactored shared ip header code for IPv4 and IPv6 in match offload. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-03-26 13:01:09 -04:00
Joe Perches	d3757ba4c1	ethernet: Use octal not symbolic permissions Prefer the direct use of octal for permissions. Done with checkpatch -f --types=SYMBOLIC_PERMS --fix-inplace and some typing. Miscellanea: o Whitespace neatening around these conversions. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-03-26 12:07:49 -04:00
Jakub Kicinski	e8a4796ee2	nfp: bpf: fix check of program max insn count NFP program allocation length is in bytes and NFP program length is in instructions, fix the comparison of the two. Fixes: `9314c442d7` ("nfp: bpf: move translation prepare to offload.c") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-03-24 10:41:24 -07:00
Dirk van der Merwe	70271dadee	nfp: advertise firmware for mixed 10G/25G mode The AMDA0099-0001 platform can support the 1x10G + 1x25G mixed mode operation. Recently, firmware has been added for this configuration mode. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-22 15:22:50 -05:00
Jakub Kicinski	f7308991bf	nfp: add Makefiles to all directories To be able to build separate objects we need to provide Kbuild with a Makefile in each directory. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-22 15:22:50 -05:00
Pieter Jansen van Vuuren	ffa61202fe	nfp: flower: implement tcp flag match offload Implement tcp flag match offloading. Current tcp flag match support include FIN, SYN, RST, PSH and URG flags, other flags are unsupported. The PSH and URG flags are only set in the hardware fast path when used in combination with the SYN, RST and PSH flags. Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-16 16:24:24 -05:00
Michael Rapson	014f900898	nfp: standardize FW header whitespace The nfp_net_ctrl.h file used spaces for indentation in the past but tabs have crept in. Host driver files use tabs for indentation by default, so let's convert to tabs for consistency across the file and our drivers. Signed-off-by: Michael Rapson <michael.rapson@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-16 16:24:24 -05:00
David S. Miller	437a4db66d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2018-02-09 The following pull-request contains BPF updates for your net tree. The main changes are: 1) Two fixes for BPF sockmap in order to break up circular map references from programs attached to sockmap, and detaching related sockets in case of socket close() event. For the latter we get rid of the smap_state_change() and plug into ULP infrastructure, which will later also be used for additional features anyway such as TX hooks. For the second issue, dependency chain is broken up via map release callback to free parse/verdict programs, all from John. 2) Fix a libbpf relocation issue that was found while implementing XDP support for Suricata project. Issue was that when clang was invoked with default target instead of bpf target, then various other e.g. debugging relevant sections are added to the ELF file that contained relocation entries pointing to non-BPF related sections which libbpf trips over instead of skipping them. Test cases for libbpf are added as well, from Jesper. 3) Various misc fixes for bpftool and one for libbpf: a small addition to libbpf to make sure it recognizes all standard section prefixes. Then, the Makefile in bpftool/Documentation is improved to explicitly check for rst2man being installed on the system as we otherwise risk installing empty man pages; the man page for bpftool-map is corrected and a set of missing bash completions added in order to avoid shipping bpftool where the completions are only partially working, from Quentin. 4) Fix applying the relocation to immediate load instructions in the nfp JIT which were missing a shift, from Jakub. 
5) Two fixes for the BPF kernel selftests: handle CONFIG_BPF_JIT_ALWAYS_ON=y gracefully in test_bpf.ko module and mark them as FLAG_EXPECTED_FAIL in this case; and explicitly delete the veth devices in the two tests test_xdp_{meta,redirect}.sh before dismantling the netnses as when selftests are run in batch mode, then workqueue to handle destruction might not have finished yet and thus veth creation in next test under same dev name would fail, from Yonghong. 6) Fix test_kmod.sh to check the test_bpf.ko module path before performing an insmod, and fallback to modprobe. Especially the latter is useful when having a device under test that has the modules installed instead, from Naresh. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-09 14:05:10 -05:00
Jakub Kicinski	1a5e8e3500	nfp: populate MODULE_VERSION DKMS and similar out-of-tree module replacement services use module version to make sure the out-of-tree software is not older than the module shipped with the kernel. We use the kernel version in ethtool -i output, put it into MODULE_VERSION as well. Reported-by: Jan Gutter <jan.gutter@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-08 10:01:27 -05:00
Jakub Kicinski	0d592e52fb	nfp: limit the number of TSO segments Most FWs limit the number of TSO segments a frame can produce to 64. This is for fairness and efficiency (of FW datapath) reasons. If a frame with larger number of segments is submitted the FW will drop it. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-08 10:01:27 -05:00
Jakub Kicinski	d692403e5c	nfp: forbid disabling hw-tc-offload on representors while offload active All netdevs which can accept TC offloads must implement .ndo_set_features(). nfp_reprs currently do not do that, which means hw-tc-offload can be turned on and off even when offloads are active. Whether the offloads are active is really a question to nfp_ports, so remove the per-app tc_busy callback indirection thing, and simply count the number of offloaded items in nfp_port structure. Fixes: `8a2768732a` ("nfp: provide infrastructure for offloading flower based TC filters") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Tested-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-08 10:01:27 -05:00
Jakub Kicinski	0b9de4ca85	nfp: don't advertise hw-tc-offload on non-port netdevs nfp_port is a structure which represents an ASIC port, both PCIe vNIC (on a PF or a VF) or the external MAC port. vNIC netdev (struct nfp_net) and pure representor netdev (struct nfp_repr) both have a pointer to this structure. nfp_reprs always have a port associated. nfp_nets, however, only represent a device port in legacy mode, where they are considered the MAC port. In switchdev mode they are just the CPU's side of the PCIe link. By definition TC offloads only apply to device ports. Don't set the flag on vNICs without a port (i.e. in switchdev mode). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Tested-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-08 10:01:27 -05:00
Jakub Kicinski	e3ac6c0737	nfp: bpf: require ETH table Upcoming changes will require all netdevs supporting TC offloads to have a full struct nfp_port. Require those for BPF offload. The operation without management FW reporting information about Ethernet ports is something we only support for very old and very basic NIC firmwares anyway. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Tested-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-08 10:01:27 -05:00
Jakub Kicinski	b7d9923547	nfp: bpf: fix immed relocation for larger offsets Immed relocation is missing a shift which means for larger offsets the lower and higher part of the address would be ORed together. Fixes: `ce4ebfd859` ("nfp: bpf: add helpers for updating immediate instructions") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-02-08 11:59:50 +01:00
Jakub Kicinski	703f578a35	nfp: fix kdoc warnings on nested structures Commit `84ce5b9877` ("scripts: kernel-doc: improve nested logic to handle multiple identifiers") improved the handling of nested structure definitions in scripts/kernel-doc, and changed the expected format of documentation. This causes new warnings to appear on W=1 builds. Only comment changes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-06 11:43:58 -05:00
Edwin Peer	1d8ef0c076	nfp: fix TLV offset calculation The data pointer in the config space TLV parser already includes NFP_NET_CFG_TLV_BASE, it should not be added again. Incorrect offset values were only used in printed user output, rendering the bug merely cosmetic. Fixes: `73a0329b05` ("nfp: add TLV capabilities to the BAR") Signed-off-by: Edwin Peer <edwin.peer@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-02-02 19:08:20 -05:00
Jakub Kicinski	3107fdc8b2	nfp: use tc_cls_can_offload_and_chain0() Make use of tc_cls_can_offload_and_chain0() to set extack msg in case ethtool tc offload flag is not set or chain unsupported. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-25 21:23:08 -05:00
Wei Yongjun	e58decc9c5	nfp: fix error return code in nfp_pci_probe() Fix to return error code -EINVAL instead of 0 when num_vfs above limit_vfs, as done elsewhere in this function. Fixes: `0dc7862191` ("nfp: handle SR-IOV already enabled when driver is probing") Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-23 10:43:28 -05:00
Carl Heymann	e71494ae68	nfp: fix fw dump handling of absolute rtsym size Fix bug that causes _absolute_ rtsym sizes of > 8 bytes (as per symbol table) to result in incorrect space used during a TLV-based debug dump. Detail: The size calculation stage calculates the correct size (size of the rtsym address field == 8), while the dump uses the size in the table to calculate the TLV size to reserve. Symbols with size <= 8 are handled OK due to aligning sizes to 8, but including any absolute symbol with listed size > 8 leads to an ENOSPC error during the dump. Fixes: `da762863ed` ("nfp: fix absolute rtsym handling in debug dump") Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-23 10:12:01 -05:00
Quentin Monnet	52be9a7cde	nfp: bpf: use extack support to improve debugging Use the recently added extack support for eBPF offload in the driver. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-22 16:28:32 -05:00
Quentin Monnet	acc2abbbb1	nfp: bpf: plumb extack into functions related to XDP offload Pass a pointer to an extack object to nfp_app_xdp_offload() in order to prepare for extack usage in the nfp driver. Next step will be to forward this extack pointer to nfp_net_bpf_offload(), once this function is able to use it for printing error messages. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-22 16:28:32 -05:00
Pieter Jansen van Vuuren	01c15e93a7	nfp: flower: prioritize stats updates Previously it was possible to interrupt processing stats updates because they were handled in a work queue. Interrupting the stats updates could lead to a situation where we back up the control message queue. This patch moves the stats update processing out of the work queue to be processed as soon as hardware sends a request. Reported-by: Louis Peens <louis.peens@netronome.com> Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-21 18:08:05 -05:00
David S. Miller	ea9722e265	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Alexei Starovoitov says: ==================== pull-request: bpf-next 2018-01-19 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) bpf array map HW offload, from Jakub. 2) support for bpf_get_next_key() for LPM map, from Yonghong. 3) test_verifier now runs loaded programs, from Alexei. 4) xdp cpumap monitoring, from Jesper. 5) variety of tests, cleanups and small x64 JIT optimization, from Daniel. 6) user space can now retrieve HW JITed program, from Jiong. Note there is a minor conflict between Russell's arm32 JIT fixes and removal of bpf_jit_enable variable by Daniel which should be resolved by keeping Russell's comment and removing that variable. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-20 22:03:46 -05:00
Jakub Kicinski	81bd5ded60	nfp: bpf: disable all ctrl vNIC capabilities BPF firmware currently exposes IRQ moderation capability. The driver will make use of it by default, inserting 50 usec delay to every control message exchange. This cuts the number of messages per second we can exchange by almost half. None of the other capabilities make much sense for BPF control vNIC, either. Disable them all. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-19 15:44:19 -05:00
Jakub Kicinski	78a0a65f40	nfp: allow apps to disable ctrl vNIC capabilities Most vNIC capabilities are netdev related. It makes no sense to initialize them and waste FW resources. Some are even counter-productive, like IRQ moderation, which will slow down exchange of control messages. Add to nfp_app a mask of enabled control vNIC capabilities for apps to use. Make flower and BPF enable all capabilities for now. No functional changes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-19 15:44:18 -05:00
Jakub Kicinski	545bfa7a6a	nfp: split reading capabilities out of nfp_net_init() nfp_net_init() is a little long and we are about to add more code to reading capabilities. Move the capability reading, parsing and validating out. Only actual initialization will stay in nfp_net_init(). No functional changes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-19 15:44:18 -05:00
Jakub Kicinski	527d7d1b99	nfp: read mailbox address from TLV caps Allow specifying alternative vNIC mailbox location in TLV caps. This way we can size the mailbox to the needs and not necessarily waste 512B of ctrl memory space. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-19 15:44:18 -05:00
Jakub Kicinski	ce991ab666	nfp: read ME frequency from vNIC ctrl memory PCIe island clock frequency is used when converting coalescing parameters from usecs to NFP timestamps. Most chips don't run at 1200MHz, allow FW to provide us with the real frequency. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-19 15:44:18 -05:00
Jakub Kicinski	73a0329b05	nfp: add TLV capabilities to the BAR NFP is entirely programmable, including the PCI data interface. Using a fixed control BAR layout certainly makes implementations easier, but requires careful consideration when space is allocated. Once BAR area is allocated to one feature nothing else can use it. Allocating space statically also requires it to be sized upfront, which leads to either unnecessary limitation or wastage. We currently have a 32bit capability word defined which tells drivers which application FW features are supported. Most of the bits are exhausted. The same bits are also reused for enabling specific features. Bulk of capabilities don't have a need for an enable bit, however, leading to confusion and wastage. TLVs seem like a better fit for expressing capabilities of applications running on programmable hardware. This patch leaves the front of the BAR as is, and declares a TLV capability start at offset 0x58. Most of the space up to 0x0d90 is already allocated, but the used space can be wrapped with RESERVED TLVs. E.g.: Address Type Length 0x0058 RESERVED 0xe00 /* Wrap basic structures */ 0x0e5c FEATURE_A 0x004 0x0e64 FEATURE_B 0x004 0x0e6c RESERVED 0x990 /* Wrap queue stats */ 0x1800 FEATURE_C 0x100 Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-19 15:44:18 -05:00
Jakub Kicinski	bf245f1fb0	nfp: improve app not found message When driver app matching loaded FW is not found users are faced with: nfp: failed to find app with ID 0x%02x This message does not properly explain that matching driver code is either not built into the driver or the driver is too old. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-19 15:44:18 -05:00
Jakub Kicinski	3eb47dfca0	nfp: protect each repr pointer individually with RCU Representors are grouped in sets by type. Currently the whole sets are under RCU protection, but individual representor pointers are not. This causes some inconveniences when representors have to be destroyed, because we have to allocate new sets to remove any representors. Protect the individual pointers with RCU. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-19 15:44:18 -05:00
Jakub Kicinski	e1740fb6c1	nfp: add nfp_reprs_get_locked() helper The write side of repr tables is always done under pf->lock. Add a helper to dereference repr table pointers under protection of that lock. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-19 15:44:18 -05:00
Jakub Kicinski	bcc93a23ca	nfp: register devlink after app is created Devlink used to have two global locks: devlink lock and port lock, our lock ordering looked like this: devlink lock -> driver's pf->lock -> devlink port lock After recent changes port lock was replaced with per-instance lock. Unfortunately, new per-instance lock is taken on most operations now. This means we can only grab the pf->lock from the port split/unsplit ops. Lock ordering looks like this: devlink lock -> driver's pf->lock -> devlink instance lock Since we can't take pf->lock from most devlink ops, make sure nfp_apps are prepared to service them as soon as devlink is registered. Locking the pf must be pushed down after nfp_app_init() callback. The init order looks like this: nfp_app_init devlink_register nfp_app_start netdev/port_register As soon as app_init is done nfp_apps must be ready to service devlink-related callbacks. apps can only register their own devlink objects from nfp_app_start. Fixes: `2406e7e546` ("devlink: Add per devlink instance lock") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-19 15:44:18 -05:00
Jakub Kicinski	e1a2599db5	nfp: release global resources only on the remove path NFP app is currently shut down as soon as all the vNICs are gone. This means we can't depend on the app existing throughout the lifetime of the device. Free the app only from PCI remove path. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-19 15:44:18 -05:00
Jakub Kicinski	aa3f4b69a7	nfp: core: make scalar CPP helpers fail on short accesses Currently the helpers for accessing 4 or 8 byte values over the CPP bus return the length of IO on success. If the IO was short caller has to deal with error handling. The short IO for 4/8B values is completely impractical. Make the helpers return an error if full access was not possible. Fix the few places which are actually dealing with errors correctly, most call sites already only deal with negative return codes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-19 15:44:18 -05:00
Jakub Kicinski	ca027a1c45	nfp: bpf: add short busy wait for FW replies Scheduling out and in for every FW message can slow us down unnecessarily. Our experiments show that even under heavy load the FW responds to 99.9% of messages within 200 us. Add a short busy wait before entering the wait queue. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-18 22:54:26 +01:00
Jakub Kicinski	7a0ef69395	bpf: offload: allow array map offload The special handling of different map types is left to the driver. Allow offload of array maps by simply adding it to accepted types. For nfp we have to make sure array elements are not deleted. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-18 22:54:25 +01:00
Jiong Wang	eb1d7db927	nfp: bpf: set new jit info fields This patch sets the new jit info fields introduced in this patch set. Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-18 01:26:15 +01:00
David S. Miller	c02b3741eb	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Overlapping changes all over. The mini-qdisc bits were a little bit tricky, however. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-17 00:10:42 -05:00
Quentin Monnet	74801e50d5	nfp: bpf: reject program on instructions unknown to the JIT compiler If an eBPF instruction is unknown to the driver JIT compiler, we can reject the program at verification time. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-17 01:15:06 +01:00
Jakub Kicinski	7dfa4d87cf	nfp: bpf: print map lookup problems into verifier log Use the verifier log to output error messages if map lookup can't be offloaded. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-17 01:15:06 +01:00
Jakub Kicinski	0d9c9f0f40	nfp: use the correct index for link speed table sts variable is holding link speed as well as state. We should be using ls to index into ls_to_ethtool. Fixes: `265aeb511b` ("nfp: add support for .get_link_ksettings()") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-16 14:55:07 -05:00
Jakub Kicinski	1bba4c413a	nfp: bpf: implement bpf map offload Plug in to the stack's map offload callbacks for BPF map offload. Get next call needs some special handling on the FW side, since we can't send a NULL pointer to the FW there is a get first entry FW command. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-14 23:36:31 +01:00
Jakub Kicinski	3dd43c3319	nfp: bpf: add support for reading map memory Map memory needs to use 40 bit addressing. Add handling of such accesses. Since 40 bit addresses are formed by using both 32 bit operands we need to pre-calculate the actual address instead of adding in the offset inside the instruction, like we did in 32 bit mode. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-14 23:36:30 +01:00
Jakub Kicinski	77a3d3113b	nfp: bpf: add verification and codegen for map lookups Verify our current constraints on the location of the key are met and generate the code for calling map lookup on the datapath. New relocation types have to be added - for helpers and return addresses. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-14 23:36:30 +01:00
Jakub Kicinski	ce4ebfd859	nfp: bpf: add helpers for updating immediate instructions Immediate loads are used to load the return address of a helper. We need to be able to update those loads for relocations. Immediate loads can be slightly more complex and spread over two instructions in general, but here we only care about simple loads of small (< 65k) constants, so complex cases are not handled. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-14 23:36:30 +01:00
Jakub Kicinski	9d080d5da9	nfp: bpf: parse function call and map capabilities Parse helper function and supported map FW TLV capabilities. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-14 23:36:30 +01:00
Jakub Kicinski	ff3d43f756	nfp: bpf: implement helpers for FW map ops Implement calls for FW map communication. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-14 23:36:30 +01:00
Jakub Kicinski	d48ae231c5	nfp: bpf: add basic control channel communication For map support we will need to send and receive control messages. Add basic support for sending a message to FW, and waiting for a reply. Control messages are tagged with a 16 bit ID. Add a simple ID allocator and make sure we don't allow too many messages in flight, to avoid request <> reply mismatches. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-14 23:36:30 +01:00
Jakub Kicinski	4da98eea79	nfp: bpf: add map data structure To be able to split code into reasonable chunks we need to add the map data structures already. Later patches will add code piece by piece. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-14 23:36:30 +01:00
Jakub Kicinski	0a9c1991f2	bpf: rename bpf_dev_offload -> bpf_prog_offload With map offload coming, we need to call program offload structure something less ambiguous. Pure rename, no functional changes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-14 23:36:29 +01:00
David S. Miller	19d28fbd30	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net BPF alignment tests got a conflict because the registers are output as Rn_w instead of just Rn in net-next, and in net a fixup for a testcase prohibits logical operations on pointers before using them. Also, we should not attempt to patch BPF call args if JIT always on is enabled. Instead, if we fail to JIT the subprogs we should pass an error back up and fail immediately. Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-11 22:13:42 -05:00
Jakub Kicinski	fc2336505f	nfp: always unmask aux interrupts at init The link state and exception interrupts may be masked when we probe. The firmware should in theory prevent sending (and automasking) those interrupts if the device is disabled, but if my reading of the FW code is correct there are firmwares out there with race conditions in this area. The interrupt may also be masked if previous driver which used the device was malfunctioning and we didn't load the FW (there is no other good way to comprehensively reset the PF). Note that FW unmasks the data interrupts by itself when vNIC is enabled, such helpful operation is not performed for LSC/EXN interrupts. Always unmask the auxiliary interrupts after request_irq(). On the remove path add missing PCI write flush before free_irq(). Fixes: `4c3523623d` ("net: add driver for Netronome NFP4000/NFP6000 NIC VFs") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-10 15:50:04 -05:00
Quentin Monnet	ff627e3d07	nfp: bpf: reuse verifier log for debug messages Now that `bpf_verifier_log_write()` is exported from the verifier and makes it possible to reuse the verifier log to print messages to the standard output, use this instead of the kernel logs in the nfp driver for printing error messages occurring at verification time. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:36 +01:00
Nic Viljoen	c087aa8bbf	nfp: bpf: add signed jump insns This patch adds signed jump instructions (jsgt, jsge, jslt, jsle) to the nfp jit, and adds the additional required raw assembler branch mask to nfp_asm.h. Signed-off-by: Nic Viljoen <nick.viljoen@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:36 +01:00
Jakub Kicinski	af93d15ac6	nfp: hand over to BPF offload app at coarser granularity Instead of having an app callback per message type hand off all offload-related handling to apps with one "rest of ndo_bpf" callback. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:36 +01:00
Jakub Kicinski	e84797fe15	nfp: bpf: use a large constant in unresolved branches To make absolute relocated branches (branches which will be completely rewritten with br_set_offset()) distinguishable in user space dumps from normal jumps add a large offset to them. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	44a12ecc1c	nfp: bpf: don't depend on high order allocations for program image The translator pre-allocates a buffer of maximal program size. Due to HW/FW limitations the program buffer can't currently be longer than 128Kb, so we used to kmalloc() it, and then map for DMA directly. Now that the late branch resolution is copying the program image anyway, we can just kvmalloc() the buffer. While at it, after translation reallocate the buffer to save space. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	2314fe9ed0	nfp: bpf: relocate jump targets just before the load Don't translate the program assuming it will be loaded at a given address. This will be required for sharing programs between ports of the same NIC, tail calls and subprograms. It will also make the jump targets easier to understand when dumping the program to user space. Translate the program as if it was going to be loaded at address zero. When load happens add the load offset in and set addresses of special branches. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	488feeaf6d	nfp: bpf: add helpers for modifying branch addresses In preparation for better handling of relocations move existing helper for setting branch offset to nfp_asm.c and add two more. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	1549921da3	nfp: bpf: move jump resolution to jit.c Jump target resolution should be in jit.c not offload.c. No functional changes. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	a0f30c97ac	nfp: bpf: allow disabling TC offloads when XDP active TC BPF offload was added first, so we used to assume that the ethtool TC HW offload flag cannot be touched whenever any BPF program is loaded on the NIC. This unnecessarily limits changes to the TC flag when offloaded program is XDP. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	ccbdc596f4	nfp: bpf: don't allow changing MTU above BPF offload limit when active When BPF offload is active we may need to restrict the MTU changes more than just to the limitation of the kernel XDP datapath. Allow the BPF code to veto an MTU change. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	c4f7730be5	nfp: bpf: round up the size of the stack Kernel enforces the alignment of the bottom of the stack, NFP deals with positive offsets better so we should align the top of the stack. Round the stack size to NFP word size (4B). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	8c6a6d9804	nfp: fix incumbent kdoc warnings We should use % instead of @ for documenting preprocessor defines. Add missing documentation of __NFP_REPR_TYPE_MAX. This gets rid of all remaining kdoc warnings in the driver. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
Jakub Kicinski	a9c324be72	nfp: don't try to register XDP rxq structures on control queues Some RX rings are used for control messages, those will not have a netdev pointer in dp. Skip XDP rxq handling on those rings. Fixes: `7f1c684a89` ("nfp: setup xdp_rxq_info") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-01-10 13:49:35 +01:00
David S. Miller	7f0b800048	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2018-01-07 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Add a start of a framework for extending struct xdp_buff without having the overhead of populating every data at runtime. Idea is to have a new per-queue struct xdp_rxq_info that holds read mostly data (currently that is, queue number and a pointer to the corresponding netdev) which is set up during rxqueue config time. When an XDP program is invoked, struct xdp_buff holds a pointer to struct xdp_rxq_info that the BPF program can then walk. The user facing BPF program that uses struct xdp_md for context can use these members directly, and the verifier rewrites context access transparently by walking the xdp_rxq_info and net_device pointers to load the data, from Jesper. 2) Redo the reporting of offload device information to user space such that it works in combination with network namespaces. The latter is reported through a device/inode tuple as similarly done in other subsystems as well (e.g. perf) in order to identify the namespace. For this to work, ns_get_path() has been generalized such that the namespace can be retrieved not only from a specific task (perf case), but also from a callback where we deduce the netns (ns_common) from a netdevice. bpftool support using the new uapi info and extensive test cases for test_offload.py in BPF selftests have been added as well, from Jakub. 3) Add two bpftool improvements: i) properly report the bpftool version such that it corresponds to the version from the kernel source tree. So pick the right linux/version.h from the source tree instead of the installed one. ii) fix bpftool and also bpf_jit_disasm build with binutils >= 2.9. The reason for the build breakage is that binutils library changed the function signature to select the disassembler. 
Given this is needed in multiple tools, add a proper feature detection to the tools/build/features infrastructure, from Roman. 4) Implement the BPF syscall command BPF_MAP_GET_NEXT_KEY for the stacktrace map. It is currently unimplemented, but there are use cases where user space needs to walk all stacktrace map entries e.g. for dumping or deleting map entries w/o having to close and recreate the map. Add BPF selftests along with it, from Yonghong. 5) Few follow-up cleanups for the bpftool cgroup code: i) rename the cgroup 'list' command into 'show' as we have it for other subcommands as well, ii) then alias the 'show' command such that 'list' is accepted which is also common practice in iproute2, and iii) remove couple of newlines from error messages using p_err(), from Jakub. 6) Two follow-up cleanups to sockmap code: i) remove the unused bpf_compute_data_end_sk_skb() function and ii) only build the sockmap infrastructure when CONFIG_INET is enabled since it's only aware of TCP sockets at this time, from John. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-07 21:26:31 -05:00
Jesper Dangaard Brouer	7f1c684a89	nfp: setup xdp_rxq_info Driver hook points for xdp_rxq_info: * reg : nfp_net_rx_ring_alloc * unreg: nfp_net_rx_ring_free In struct nfp_net_rx_ring moved member @size into a hole on 64-bit. Thus, the size remains the same after adding member @xdp_rxq. Cc: oss-drivers@netronome.com Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Simon Horman <simon.horman@netronome.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-01-05 15:21:21 -08:00
Jakub Kicinski	d0adb51edb	nfp: add basic multicast filtering We currently always pass all multicast traffic through. Only set L2MC when actually needed. Since the driver was not making use of the capability to filter out mcast frames, some FW projects don't implement it any more. Don't warn users if capability is not present (like we do for promisc flag). The lack of L2MC capability is assumed to mean all multicast traffic goes through. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-05 13:46:47 -05:00
Dirk van der Merwe	d2c2928d86	nfp: flower: implement the PORT_REIFY message The PORT_REIFY message indicates whether reprs have been created or when they are about to be destroyed. This is necessary so firmware can know which state the driver is in, e.g. the firmware will not send any control messages related to ports when the reprs are destroyed. This prevents nuisance warning messages printed whenever the firmware sends updates for non-existent reprs. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-03 12:17:49 -05:00
Dirk van der Merwe	0f08479143	nfp: add repr_preclean callback Just before a repr is cleaned up, we give the app a chance to perform some preclean configuration while the reprs pointer is still configured for the app. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-03 12:17:38 -05:00
Dirk van der Merwe	c6d20ab4d7	nfp: flower: obtain repr link state only from firmware Instead of starting up reprs assuming that there is link, only respond to the link state reported by firmware. Furthermore, ensure link is down after repr netdevs are created. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-03 12:17:30 -05:00
Jakub Kicinski	cae1927c0b	bpf: offload: allow netdev to disappear while verifier is running To allow verifier instruction callbacks without any extra locking NETDEV_UNREGISTER notification would wait on a waitqueue for verifier to finish. This design decision was made when rtnl lock was providing all the locking. Use the read/write lock instead and remove the workqueue. Verifier will now call into the offload code, so dev_ops are moved to offload structure. Since verifier calls are all under bpf_prog_is_dev_bound() we no longer need static inline implementations to please builds with CONFIG_NET=n. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-31 16:12:23 +01:00
Jakub Kicinski	4f83435ad7	nfp: bpf: allocate vNIC priv for keeping track of the offloaded program After TC offloads were converted to callbacks we have no choice but to keep track of the offloaded filter in the driver. Since this change came a little late in the release cycle there were a number of conflicts and allocation of vNIC priv structure seems to have slipped away in linux-next. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-27 20:37:38 -05:00
David S. Miller	fba961ab29	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Lots of overlapping changes. Also on the net-next side the XDP state management is handled more in the generic layers so undo the 'net' nfp fix which isn't applicable in net-next. Include a necessary change by Jakub Kicinski, with log message: ==================== cls_bpf no longer takes care of offload tracking. Make sure netdevsim performs necessary checks. This fixes a warning caused by TC trying to remove a filter it has not added. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-22 11:16:31 -05:00
Jakub Kicinski	d3f89b98e3	nfp: bpf: keep track of the offloaded program After TC offloads were converted to callbacks we have no choice but to keep track of the offloaded filter in the driver. The check for nn->dp.bpf_offload_xdp was a stop gap solution to make sure failed TC offload won't disable XDP, it's no longer necessary. nfp_net_bpf_offload() will return -EBUSY on TC vs XDP conflicts. Fixes: `3f7889c4c7` ("net: sched: cls_bpf: call block callbacks for offload") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-20 13:08:18 -05:00
Jakub Kicinski	102740bd94	cls_bpf: fix offload assumptions after callback conversion cls_bpf used to take care of tracking what offload state a filter is in, i.e. it would track if offload request succeeded or not. This information would then be used to issue correct requests to the driver, e.g. requests for statistics only on offloaded filters, removing only filters which were offloaded, using add instead of replace if previous filter was not added etc. This tracking of offload state no longer functions with the new callback infrastructure. There could be multiple entities trying to offload the same filter. Throw out all the tracking and corresponding commands and simply pass to the drivers both old and new bpf program. Drivers will have to deal with offload state tracking by themselves. Fixes: `3f7889c4c7` ("net: sched: cls_bpf: call block callbacks for offload") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-20 13:08:18 -05:00
John Hurley	3ca3059dc3	nfp: flower: compile Geneve encap actions Generate rules for the NFP to encapsulate packets in Geneve tunnels. Move the vxlan action code to generic udp tunnel actions and use core code for both vxlan and Geneve. Only support outputting to well known port 6081. Setting tunnel options is not supported yet. Only attempt to offload if the fw supports Geneve. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-19 14:52:13 -05:00
John Hurley	bedeca15af	nfp: flower: compile Geneve match fields Compile Geneve match fields for offloading to the NFP. The addition of Geneve overflows the 8 bit key_layer field, so apply extended metadata to the match cmsg allowing up to 32 more key_layer fields. Rather than adding new Geneve blocks, move the vxlan code to generic ipv4 udp tunnel structs and use these for both vxlan and Geneve. Matches are only supported when specifically mentioning well known port 6081. Geneve tunnel options are not yet included in the match. Only offload Geneve if the fw supports it - include check for this. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-19 14:52:12 -05:00
John Hurley	739973486f	nfp: flower: read extra feature support from fw Extract the _abi_flower_extra_features symbol from the fw which gives a 64 bit bitmap of new features (on top of the flower base support) that the fw can offload. Store this bitmap in the priv data associated with each app. If the symbol does not exist, set the bitmap to 0. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-19 14:52:12 -05:00
John Hurley	574f1e9ccc	nfp: flower: remove unused tun_mask variable The tunnel dest IP is required for separate offload to the NFP. It is already verified that a dest IP must be present and must be an exact match in the flower rule. Therefore, we can just extract the IP from the generated offload rule and remove the unused mask variable. The function is then no longer required to return the IP separately. Because tun_dst is localised to tunnel matches, move the declaration to the tunnel if branch. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-19 14:52:12 -05:00
David S. Miller	59436c9ee1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Daniel Borkmann says: ==================== pull-request: bpf-next 2017-12-18 The following pull-request contains BPF updates for your net-next tree. The main changes are: 1) Allow arbitrary function calls from one BPF function to another BPF function. As of today when writing BPF programs, __always_inline had to be used in the BPF C programs for all functions, unnecessarily causing LLVM to inflate code size. Handle this more naturally with support for BPF to BPF calls such that this __always_inline restriction can be overcome. As a result, it allows for better optimized code and finally enables to introduce core BPF libraries in the future that can be reused out of different projects. x86 and arm64 JIT support was added as well, from Alexei. 2) Add infrastructure for tagging functions as error injectable and allow for BPF to return arbitrary error values when BPF is attached via kprobes on those. This way of injecting errors generically eases testing and debugging without having to recompile or restart the kernel. Tags for opting-in for this facility are added with BPF_ALLOW_ERROR_INJECTION(), from Josef. 3) For BPF offload via nfp JIT, add support for bpf_xdp_adjust_head() helper call for XDP programs. First part of this work adds handling of BPF capabilities included in the firmware, and the later patches add support to the nfp verifier part and JIT as well as some small optimizations, from Jakub. 4) The bpftool now also gets support for basic cgroup BPF operations such as attaching, detaching and listing current BPF programs. As a requirement for the attach part, bpftool can now also load object files through 'bpftool prog load'. This reuses libbpf which we have in the kernel tree as well. bpftool-cgroup man page is added along with it, from Roman. 
5) Back then commit `e87c6bc385` ("bpf: permit multiple bpf attachments for a single perf event") added support for attaching multiple BPF programs to a single perf event. Given they are configured through perf's ioctl() interface, the interface has been extended with a PERF_EVENT_IOC_QUERY_BPF command in this work in order to return an array of one or multiple BPF prog ids that are currently attached, from Yonghong. 6) Various minor fixes and cleanups to the bpftool's Makefile as well as a new 'uninstall' and 'doc-uninstall' target for removing bpftool itself or prior installed documentation related to it, from Quentin. 7) Add CONFIG_CGROUP_BPF=y to the BPF kernel selftest config file which is required for the test_dev_cgroup test case to run, from Naresh. 8) Fix reporting of XDP prog_flags for nfp driver, from Jakub. 9) Fix libbpf's exit code from the Makefile when libelf was not found in the system, also from Jakub. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-18 10:51:06 -05:00
Jakub Kicinski	4a29c0db69	nfp: set flags in the correct member of netdev_bpf netdev_bpf.flags is the input member for installing the program. netdev_bpf.prog_flags is the output member for querying. Set the correct one on query. Fixes: `92f0292b35` ("net: xdp: report flags program was installed with on query") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-17 20:41:59 +01:00
Jakub Kicinski	0bce7c9a60	nfp: bpf: correct printk formats for size_t Build bot reported warning about invalid printk formats on 32bit architectures. Use %zu for size_t and %zd ptr diff. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-15 22:01:05 +01:00
Carl Heymann	28b2d7d04b	nfp: fix XPB register reads in debug dump For XPB registers reads, some island IDs require special handling (e.g. ARM island), which is already taken care of in nfp_xpb_readl(), so use that instead of a straight CPP read. Without this fix all "xpbm:ArmIsldXpbmMap.*" registers are reported as 0xffffffff. It has also been observed to cause a system reboot. With this fix correct values are reported, none of which are 0xffffffff. The values may be read using ethtool debug level 2. # ethtool -W <netdev> 2 # ethtool -w <netdev> data dump.dat Fixes: `0e6c4955e1` ("nfp: dump CPP, XPB and direct ME CSRs") Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-15 12:48:45 -05:00
Carl Heymann	da762863ed	nfp: fix absolute rtsym handling in debug dump In TLV-based ethtool debug dumps, don't do a CPP read for absolute rtsyms, use the addr field in the symbol table directly as the value. Without this fix rtsym gro_release_ring_0 is 4 bytes of zeros. With this fix the correct value, 0x0000004a 0x00000000 is reported. The values may be read using ethtool debug level 2. # ethtool -W <netdev> 2 # ethtool -w <netdev> data dump.dat Fixes: `e1e798e3fd` ("nfp: dump rtsyms") Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-15 12:48:45 -05:00
Dirk van der Merwe	7a74156591	nfp: implement firmware flashing Firmware flashing takes around 60s (specified to not take more than 70s). Prevent hogging the RTNL lock in this time and make use of the longer timeout for the NSP command. The timeout is set to 2.5 * 70 seconds. We only allow flashing the firmware from reprs or PF netdevs. VFs do not have an app reference. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-15 12:26:12 -05:00
Dirk van der Merwe	87a23801e5	nfp: extend NSP infrastructure for configurable timeouts The firmware flashing NSP operation takes longer to execute than the current default timeout. We need a mechanism to set a longer timeout for some commands. This patch adds the infrastructure to this. The default timeout is still 30 seconds. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-15 12:26:12 -05:00
Jakub Kicinski	8231f84441	nfp: bpf: optimize the adjust_head calls in trivial cases If the program is simple and has only one adjust head call with constant parameters, we can check that the call will always succeed at translation time. We need to track the location of the call and make sure parameters are always the same. We also have to check the parameters against datapath constraints and ETH_HLEN. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-15 14:18:18 +01:00
Jakub Kicinski	0d49eaf4db	nfp: bpf: add basic support for adjust head call Support bpf_xdp_adjust_head(). We need to check whether the packet offset after adjustment is within datapath's limits. We also check if the frame is at least ETH_HLEN long (similar to the kernel implementation). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-15 14:18:18 +01:00
Jakub Kicinski	2cb230bded	nfp: bpf: prepare for call support Add skeleton of verifier checks and translation handler for call instructions. Make sure jump target resolution will not treat them as jumps. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-15 14:18:18 +01:00
Jakub Kicinski	77a844ee65	nfp: bpf: prepare for parsing BPF FW capabilities BPF FW creates a run time symbol called bpf_capabilities which contains TLV-formatted capability information. Allocate app private structure to store parsed capabilities and add a skeleton of parsing logic. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-15 14:18:18 +01:00
Jakub Kicinski	a351ab565c	nfp: add nfp_cpp_area_size() accessor Allow users outside of core reading area sizes. This was not needed previously because whatever entity created the area would usually know what size it asked for. The nfp_rtsym_map() helper, however, will allocate the area based on the size of an RT-symbol with given name. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-15 14:18:18 +01:00
Carl Heymann	92a54f4a47	nfp: debug dump - decrease endian conversions Convert the requested dump level parameter to big-endian at the start of nfp_net_dump_calculate_size() and nfp_net_dump_populate_buffer(), then compare and assign it directly where needed in the traversal and prolog code. This decreases the total number of conversions used. Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-11 12:08:13 -05:00
John Hurley	197171e5ba	nfp: flower: remove unused defines Delete match field defines that are not supported at this time. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-11 12:08:04 -05:00
John Hurley	a427673e1f	nfp: flower: remove dead code paths Port matching is selected by default on every rule so remove check for it and delete 'else' side of the statement. Remove nfp_flower_meta_one as now it will not feature in the code. Rename nfp_flower_meta_two given that one has been removed. 'Additional metadata' if statement can never be true so remove it as well. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-11 12:07:57 -05:00
John Hurley	de7d954984	nfp: flower: do not assume mac/mpls matches Remove the matching of mac/mpls as a default selection. These are not necessarily set by a TC rule (unlike the port). Previously a mac/mpls field would exist in every match and be masked out if not used. This patch has no impact on functionality but removes unnecessary memory assignment in the match cmsg. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-11 12:07:47 -05:00
David S. Miller	51e18a453f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Conflict was two parallel additions of include files to sch_generic.c, no biggie. Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-09 22:09:55 -05:00
Cong Wang	9f8a739e72	act_mirred: get rid of tcfm_ifindex from struct tcf_mirred tcfm_dev always points to the correct netdev and we already hold a refcnt, so no need to use tcfm_ifindex to lookup again. If we would support moving target netdev across netns, using pointer would be better than ifindex. This also fixes dumping obsolete ifindex, now after the target device is gone we just dump 0 as ifindex. Cc: Jiri Pirko <jiri@mellanox.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-06 14:50:13 -05:00
Carl Heymann	60b84a9b38	nfp: dump indirect ME CSRs - The spec defines CSR address ranges for indirect ME CSRs. For Each TLV chunk in the spec, dump a chunk that includes the spec and the data over the defined address range. - Each indirect CSR has 8 contexts. To read one context, first write the context to a specific derived address, read it back, and then read the register value. - For each address, read and dump all 8 contexts in this manner. Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-05 15:01:03 -05:00
Carl Heymann	0e6c4955e1	nfp: dump CPP, XPB and direct ME CSRs - The spec defines CSR address ranges for these types. - Dump each TLV chunk in the spec as a chunk that includes the spec and the data over the defined address range. Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-05 15:01:02 -05:00
Carl Heymann	e9364d30d5	nfp: dump firmware name Dump FW name as TLV, based on dump specification. Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-05 15:01:02 -05:00
Carl Heymann	10144de383	nfp: dump single hwinfo field by key - Add spec TLV for hwinfo field, containing key string as data. - Add dump TLV for hwinfo field, with data being key and value as packed zero-terminated strings. - If specified hwinfo field is not found, dump the spec TLV as -ENOENT error. Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-05 15:01:02 -05:00
Carl Heymann	24ff8455af	nfp: dump all hwinfo - Dump hwinfo as separate TLV chunk, in a packed format containing zero-separated key and value strings. - This provides additional debug context, if requested by the dumpspec. Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-05 15:01:02 -05:00
Carl Heymann	e1e798e3fd	nfp: dump rtsyms - Support rtsym TLVs. - If specified rtsym is not found, dump the spec TLV as -ENOENT error. Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-05 15:01:01 -05:00
Carl Heymann	f3682c7866	nfp: dumpspec TLV traversal - Perform dumpspec traversals for calculating size and populating the dump. - Initially, wrap all spec TLVs in dump error TLVs (changed by later patches in the series). Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-05 15:01:01 -05:00
Carl Heymann	f7852b8e9e	nfp: dump prolog - Use a TLV structure, with the typed chunks aligned to 8-byte sizes. - Dump numeric fields as big-endian. - Prolog contains the dump level. Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-05 15:01:01 -05:00
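The prolog commit above describes the dump layout only in words: TLV chunks, sizes aligned to 8 bytes, numeric fields stored big-endian. The following is a minimal sketch of that layout, not the driver's code. The 32-bit type and length header fields, the choice to record the padded length, and the names `dump_put_tlv`, `dump_align8` and `put_be32` are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round a size up to the next multiple of 8 bytes. */
static size_t dump_align8(size_t n)
{
	return (n + 7) & ~(size_t)7;
}

/* Store a 32-bit value in big-endian byte order. */
static void put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

/*
 * Write one TLV chunk: 4-byte type, 4-byte length, then the value,
 * zero-padded to an 8-byte multiple. The header layout and the use of
 * the padded length are assumptions of this sketch.
 * Returns the number of bytes written, or 0 if the buffer is too small.
 */
static size_t dump_put_tlv(uint8_t *buf, size_t buf_len, uint32_t type,
			   const void *data, size_t data_len)
{
	size_t padded, total;

	if (data_len > buf_len)
		return 0;

	padded = dump_align8(data_len);
	total = 8 + padded;
	if (total > buf_len)
		return 0;

	put_be32(buf, type);
	put_be32(buf + 4, (uint32_t)padded);
	memset(buf + 8, 0, padded);
	if (data_len)
		memcpy(buf + 8, data, data_len);

	return total;
}
```

Keeping every chunk a multiple of 8 bytes means a reader can step from one header to the next without per-chunk alignment fixups, and fixed big-endian fields make the dump independent of host byte order.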
Carl Heymann	8a925303b6	nfp: load debug dump spec Load the TLV-based binary specification of what needs to be included in a dump, from the "_abi_dump_spec" rtsymbol. If the symbol is not defined, then dumps for levels >= 1 are not supported. Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-05 15:01:01 -05:00
Carl Heymann	d79e19f564	nfp: debug dump ethtool ops - Skeleton code to perform a binary debug dump via ethtoolops "set_dump", "get_dump_flags" and "get_dump_data", i.e. the ethtool -W/w mechanism. - Skeleton functions for debugdump operations provided. - An integer "dump level" can be specified, this is stored between ethtool invocations. Dump level 0 is still the "arm.diag" resource for backward compatibility. Other dump levels each define a set of state information to include in the dump, driven by a spec from FW. Signed-off-by: Carl Heymann <carl.heymann@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-05 15:01:01 -05:00
Pieter Jansen van Vuuren	42d779ffc1	nfp: fix port stats for mac representors Previously we swapped the tx_packets, tx_bytes and tx_dropped counters with rx_packets, rx_bytes and rx_dropped counters, respectively. This behaviour is correct and expected for VF representors but it should not be swapped for physical port mac representors. Fixes: `eadfa4c3be` ("nfp: add stats and xmit helpers for representors") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-12-05 11:29:36 -05:00
Jakub Kicinski	bd0b2e7fe6	net: xdp: make the stack take care of the tear down Since day one of XDP drivers had to remember to free the program on the remove path. This leads to code duplication and is error prone. Make the stack query the installed programs on unregister and if something is installed, remove the program. Freeing of program attached to XDP generic is moved from free_netdev() as well. Because the remove will now be called before notifiers are invoked, BPF offload state of the program will not get destroyed before uninstall. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-03 00:27:57 +01:00
Jakub Kicinski	92f0292b35	net: xdp: report flags program was installed with on query Some drivers enforce that flags on program replacement and removal must match the flags passed on install. This leaves the possibility open to enable simultaneous loading of XDP programs both to HW and DRV. Allow such drivers to report the flags back to the stack. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-03 00:27:57 +01:00
Jiong Wang	6bc7103c89	nfp: bpf: detect load/store sequences lowered from memory copy This patch adds the optimization frontend, by adding a new eBPF IR scan pass "nfp_bpf_opt_ldst_gather". The pass will traverse the IR to recognize the load/store pair sequences that come from lowering of memory copy builtins. The gathered memory copy information will be kept in the meta info structure of the first load instruction in the sequence and will be consumed by the optimization backend added in the previous patches. NOTE: a sequence with cross memory access doesn't qualify for this optimization, i.e. if one load in the sequence will load from a place that has been written by a previous store. This is because when we turn the sequence into a single CPP operation, we are reading all contents at once into NFP transfer registers, then writing them out as a whole. This is not identical with what the original load/store sequence is doing. Detecting cross memory access for two random pointers will be difficult, fortunately under XDP/eBPF's restricted runtime environment, copies normally happen among map, packet data and stack, which do not overlap with each other. And for cases supported by NFP, cross memory access will only happen on PTR_TO_PACKET. Fortunately for this, there is ID information with which we can do an accurate memory alias check. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:20 +01:00
Jiong Wang	8c90053858	nfp: bpf: implement memory bulk copy for length bigger than 32-bytes When the gathered copy length is bigger than 32-bytes and within 128-bytes (the maximum length a single CPP Pull/Push request can finish), the read/write strategy is changed to: * Read. - use direct reference mode when length is within 32-bytes. - use indirect mode when length is bigger than 32-bytes. * Write. - length <= 8-bytes use write8 (direct_ref). - length <= 32-byte and 4-bytes aligned use write32 (direct_ref). - length <= 32-bytes but not 4-bytes aligned use write8 (indirect_ref). - length > 32-bytes and 4-bytes aligned use write32 (indirect_ref). - length > 32-bytes and not 4-bytes aligned and <= 40-bytes use write32 (direct_ref) to finish the first 32-bytes. use write8 (direct_ref) to finish all remaining hanging part. - length > 32-bytes and not 4-bytes aligned use write32 (indirect_ref) to finish those 4-byte aligned parts. use write8 (direct_ref) to finish all remaining hanging part. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:20 +01:00
Jiong Wang	9879a3814b	nfp: bpf: implement memory bulk copy for length within 32-bytes For NFP, we want to re-group a sequence of load/store pairs lowered from memcpy/memmove into single memory bulk operation which then could be accelerated using NFP CPP bus. This patch extends the existing load/store auxiliary information by adding two new fields: struct bpf_insn paired_st; s16 ldst_gather_len; Both fields are supposed to be carried by the load instruction at the head of the sequence. "paired_st" is the corresponding store instruction at the head and "ldst_gather_len" is the gathered length. If "ldst_gather_len" is negative, then the sequence is doing memory load/store in descending order, otherwise it is in ascending order. We need this information to detect overlapped memory access. This patch then optimizes memory bulk copy when the copy length is within 32-bytes. The strategy of read/write used is: * Read. Use read32 (direct_ref), always. * Write. - length <= 8-bytes write8 (direct_ref). - length <= 32-bytes and is 4-byte aligned write32 (direct_ref). - length <= 32-bytes but is not 4-byte aligned write8 (indirect_ref). NOTE: the optimization should not change program semantics. The destination register of the last load instruction should contain the same value before and after this optimization. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:20 +01:00
Jiong Wang	5e4d6d2093	nfp: bpf: factor out is_mbpf_load & is_mbpf_store It is usual that we need to check if one BPF insn is for loading/storing data from/to memory. Therefore, it makes sense to factor out related code to become common helper functions. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:20 +01:00
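The write rules listed in the two bulk-copy commit messages above amount to a small decision table keyed on the gathered length. The following is a sketch of that table only, not the JIT's implementation. The names `wr_cmd`, `wr_step`, `wr_plan`, `plan_bulk_write` and `bulk_read_is_indirect` are hypothetical, and the sketch assumes that "4-byte aligned" refers to the copy length being a multiple of 4.

```c
#include <stdbool.h>

/* Hypothetical names for this sketch; not the driver's identifiers. */
enum wr_cmd { WR_NONE, WR_WRITE8, WR_WRITE32 };

struct wr_step {
	enum wr_cmd cmd;
	bool indirect;     /* true: indirect_ref, false: direct_ref */
	unsigned int len;  /* bytes covered by this step */
};

struct wr_plan {
	struct wr_step main;
	struct wr_step tail; /* WR_NONE when nothing is left over */
};

/* Reads use direct reference up to 32 bytes and indirect mode beyond. */
static bool bulk_read_is_indirect(unsigned int len)
{
	return len > 32;
}

/* Pick the write steps for a gathered copy of len bytes (1..128). */
static bool plan_bulk_write(unsigned int len, struct wr_plan *p)
{
	/* Assumption: "aligned" means the length is a multiple of 4. */
	bool aligned = (len % 4) == 0;

	if (len == 0 || len > 128)
		return false;

	p->tail = (struct wr_step){ WR_NONE, false, 0 };

	if (len <= 8) {
		p->main = (struct wr_step){ WR_WRITE8, false, len };
	} else if (len <= 32) {
		if (aligned)
			p->main = (struct wr_step){ WR_WRITE32, false, len };
		else
			p->main = (struct wr_step){ WR_WRITE8, true, len };
	} else if (aligned) {
		p->main = (struct wr_step){ WR_WRITE32, true, len };
	} else if (len <= 40) {
		/* First 32 bytes with write32, remainder (1..7) with write8. */
		p->main = (struct wr_step){ WR_WRITE32, false, 32 };
		p->tail = (struct wr_step){ WR_WRITE8, false, len - 32 };
	} else {
		/* 4-byte multiple with write32, remainder (1..3) with write8. */
		p->main = (struct wr_step){ WR_WRITE32, true, len & ~3u };
		p->tail = (struct wr_step){ WR_WRITE8, false, len & 3u };
	}

	return true;
}
```

For example, a 35-byte copy comes out as a 32-byte direct write32 plus a 3-byte direct write8, while a 45-byte copy comes out as a 44-byte indirect write32 plus a 1-byte direct write8.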
Jakub Kicinski	5468a8b929	nfp: bpf: encode indirect commands Add support for emitting commands with field overwrites. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:20 +01:00
Jiong Wang	3239e7bb28	nfp: bpf: correct the encoding for No-Dest immed When immed is used with No-Dest, the emitter should use reg.dst instead of reg.areg for the destination, using the latter will actually encode register zero. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:20 +01:00
Jiong Wang	08859f159e	nfp: bpf: relax source operands check The NFP normally requires the source operands to be different addressing modes, but we should rule out the very special NN_REG_NONE type. There are instructions that ignore both A/B operands, for example: local_csr_rd For these instructions, we might pass the same operand type, NN_REG_NONE, for both A/B operands. NOTE: in current NFP ISA, it is only possible for instructions with unrestricted operands to take none operands, but in case there is a new and similar instruction in restricted form, it would follow similar rules, so swreg_to_restricted is updated as well. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:20 +01:00
Jiong Wang	29fe46efba	nfp: bpf: don't do ld/shifts combination if shifts are jump destination If any of the shift insns in the ld/shift sequence is jump destination, don't do combination. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:20 +01:00
Jiong Wang	1266f5d655	nfp: bpf: don't do ld/mask combination if mask is jump destination If the mask insn in the ld/mask pair is jump destination, then don't do combination. Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:20 +01:00
Jiong Wang	a09d5c52c4	nfp: bpf: flag jump destination to guide insn combine optimizations NFP eBPF offload JIT engine is doing some instruction combine based optimizations which however may not be safe if the combined sequences are across basic block borders. Currently, there are post checks during fixing jump destinations. If the jump destination is found to be an eBPF insn that has been combined into another one, then JIT engine will raise an error and abort. This is not optimal. The JIT engine ought to disable the optimization on such cross-bb-border sequences instead of aborting. As there is no control flow information in the eBPF infrastructure, we can't do basic block based optimizations, so this patch extends the existing jump destination record pass to also flag the jump destination, then in instruction combine passes we could skip the optimizations if insns in the sequence are jump targets. Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:19 +01:00
Jiong Wang	5b674140ad	nfp: bpf: record jump destination to simplify jump fixup eBPF insns are internally organized as dual-list inside NFP offload JIT. Random access to an insn needs to be done by either forward or backward traversal along the list. One place we need to do such traversal is at nfp_fixup_branches where one traversal is needed for each jump insn to find the destination. Such traversals could be avoided if jump destinations are collected through a single traversal in a pre-scan pass, and such information could also be useful in other places where jump destination info is needed. This patch adds such jump destination collection in nfp_prog_prepare. Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:19 +01:00
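The jump-destination pre-scan described in the commit above, together with the flag that the combine passes consult, can be illustrated with a simplified sketch. This is not the driver's code: it uses a plain array instead of the JIT's linked list, and `sketch_insn`, `record_jump_dsts` and `can_combine` are hypothetical names. It relies only on the eBPF convention that a jump at index i with offset off targets index i + 1 + off.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the JIT's per-instruction metadata. */
struct sketch_insn {
	bool is_jump;      /* instruction is a jump */
	int off;           /* relative offset: target = index + 1 + off */
	long jmp_dst;      /* resolved target index, -1 if not a jump */
	bool is_jump_dst;  /* some jump lands on this instruction */
};

/*
 * Resolve every jump once and flag its destination.
 * Returns false if any jump points outside the program.
 */
static bool record_jump_dsts(struct sketch_insn *prog, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		prog[i].jmp_dst = -1;
		prog[i].is_jump_dst = false;
	}

	for (i = 0; i < n; i++) {
		long dst;

		if (!prog[i].is_jump)
			continue;

		dst = (long)i + 1 + prog[i].off;
		if (dst < 0 || (size_t)dst >= n)
			return false;

		prog[i].jmp_dst = dst;
		prog[dst].is_jump_dst = true;
	}

	return true;
}

/*
 * A sequence of count instructions starting at first may be combined
 * only if no instruction after the head is a jump destination.
 */
static bool can_combine(const struct sketch_insn *prog, size_t first,
			size_t count)
{
	size_t i;

	for (i = first + 1; i < first + count; i++)
		if (prog[i].is_jump_dst)
			return false;

	return true;
}
```

With the destinations recorded up front, a later fixup pass can read `jmp_dst` directly instead of walking the program for each jump, and a combine pass can skip a sequence rather than abort when a jump lands in its middle.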
Jiong Wang	854dc87d1a	nfp: bpf: support backward jump This patch adds support for backward jump on NFP. - restrictions on backward jump in various functions have been removed. - nfp_fixup_branches now supports backward jump. There is one thing to note, currently an input eBPF JMP insn may generate several NFP insns, for example, NFP imm move insn A \ NFP compare insn B --> 3 NFP insn jited from eBPF JMP insn M NFP branch insn C / --- NFP insn X --> 1 NFP insn jited from eBPF insn N --- ... therefore, we are doing sanity check to make sure the last jited insn from an eBPF JMP is a NFP branch instruction. Once backward jump is allowed, it is possible an eBPF JMP insn is at the end of the program. This is however causing trouble for the sanity check. Because the sanity check requires the end index of the NFP insns jited from one eBPF insn while only the start index is recorded before this patch that we can only get the end index by: start_index_of_the_next_eBPF_insn - 1 or for the above example: start_index_of_eBPF_insn_N (which is the index of NFP insn X) - 1 nfp_fixup_branches was using nfp_for_each_insn_walk2 to expose next insn to each iteration during the traversal so the last index could be calculated from which. Now, it needs some extra code to handle the last insn. Meanwhile, the use of walk2 is actually unnecessary, we could simply use generic single instruction walk to do this, the next insn could be easily calculated using list_next_entry. So, this patch migrates the jump fixup traversal method to list_for_each_entry, this simplifies the code logic a little bit. The other thing to note is a new state variable "last_bpf_off" is introduced to track the index of the last jited NFP insn. This is necessary because NFP is generating special purposes epilogue sequences, so the index of the last jited NFP insn is not always nfp_prog->prog_len - 1. 
Suggested-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:19 +01:00
Jakub Kicinski	a646c9b2da	nfp: fix old kdoc issues Since commit `3a025e1d1c` ("Add optional check for bad kernel-doc comments") when built with W=1 build will complain about kdoc errors. Fix the kdoc issues we have. kdoc is still confused by defines in nfp_net_ctrl.h but those are not really errors. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-12-01 20:59:19 +01:00
David S. Miller	e4be7baba8	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2017-11-23 The following pull-request contains BPF updates for your net tree. The main changes are: 1) Several BPF offloading fixes, from Jakub. Among others: - Limit offload to cls_bpf and XDP program types only. - Move device validation into the driver and don't make any assumptions about the device in the classifier due to shared blocks semantics. - Don't pass offloaded XDP program into the driver when it should be run in native XDP instead. Offloaded ones are not JITed for the host in such cases. - Don't destroy device offload state when moved to another namespace. - Revert dumping offload info into user space for now, since ifindex alone is not sufficient. This will be redone properly for bpf-next tree. 2) Fix test_verifier to avoid using bpf_probe_write_user() helper in test cases, since it's dumping a warning into kernel log which may confuse users when only running tests. Switch to use bpf_trace_printk() instead, from Yonghong. 3) Several fixes for correcting ARG_CONST_SIZE_OR_ZERO semantics before it becomes uabi, from Gianluca. More specifically: - Add a type ARG_PTR_TO_MEM_OR_NULL that is used only by bpf_csum_diff(), where the argument is either a valid pointer or NULL. The subsequent ARG_CONST_SIZE_OR_ZERO then enforces a valid pointer in case of non-0 size or a valid pointer or NULL in case of size 0. Given that, the semantics for ARG_PTR_TO_MEM in combination with ARG_CONST_SIZE_OR_ZERO are now such that in case of size 0, the pointer must always be valid and cannot be NULL. This fix in semantics allows for bpf_probe_read() to drop the recently added size == 0 check in the helper that would become part of uabi otherwise once released. 
At the same time we can then fix bpf_probe_read_str() and bpf_perf_event_output() to use ARG_CONST_SIZE_OR_ZERO instead of ARG_CONST_SIZE in order to fix recently reported issues by Arnaldo et al, where LLVM optimizes two boundary checks into a single one for unknown variables where the verifier loses track of the variable bounds and thus rejects valid programs otherwise. 4) A fix for the verifier for the case when it detects comparison of two constants where the branch is guaranteed to not be taken at runtime. Verifier will rightfully prune the exploration of such paths, but we still pass the program to JITs, where they would complain about using reserved fields, etc. Track such dead instructions and sanitize them with mov r0,r0. Rejection is not possible since LLVM may generate them for valid C code and doesn't do as much data flow analysis as verifier. For bpf-next we might implement removal of such dead code and adjust branches instead. Fix from Alexei. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-24 02:33:01 +09:00
Jakub Kicinski	b48b1f7ac7	nfp: flower: add missing kdoc Commit `0115552eac` ("nfp: remove false positive offloads in flower vxlan") missed adding kdoc for a new parameter of nfp_flower_add_offload(). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-21 20:24:37 +09:00
Jakub Kicinski	288b3de55a	bpf: offload: move offload device validation out to the drivers With TC shared block changes we can't depend on correct netdev pointer being available in cls_bpf. Move the device validation to the driver. Core will only make sure that offloaded programs are always attached in the driver (or in HW by the driver). We trust that drivers which implement offload callbacks will perform necessary checks. Moving the checks to the driver is generally a useful thing, in practice the check should be against a switchdev instance, not a netdev, given that most ASICs will probably allow using the same program on many ports. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2017-11-21 00:37:35 +01:00
Linus Torvalds	8170024750	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Revert regression inducing change to the IPSEC template resolver, from Steffen Klassert. 2) Peeloffs can cause the wrong sk to be woken up in SCTP, fix from Xin Long. 3) Min packet MTU size is wrong in cpsw driver, from Grygorii Strashko. 4) Fix build failure in netfilter ctnetlink, from Arnd Bergmann. 5) ISDN hisax driver checks pnp_irq() for errors incorrectly, from Arvind Yadav. 6) Fix fealnx driver build failure on MIPS, from Huacai Chen. 7) Fix info leak in SCTP, the scope_id of socket addresses is not always filled in. From Eric W. Biederman. 8) MTU inheritance between physical function and representor fix in nfp driver, from Dirk van der Merwe. 9) Fix memory leak in rsi driver, from Colin Ian King. 10) Fix expiration and generation ID handling of cached ipv4 redirect routes, from Xin Long. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (40 commits) net: usb: hso.c: remove unneeded DRIVER_LICENSE #define ibmvnic: fix dma_mapping_error call ipvlan: NULL pointer dereference panic in ipvlan_port_destroy route: also update fnhe_genid when updating a route cache route: update fnhe_expires for redirect when the fnhe exists sctp: set frag_point in sctp_setsockopt_maxseg correctly rsi: fix memory leak on buf and usb_reg_buf net/netlabel: Add list_next_rcu() in rcu_dereference(). nfp: remove false positive offloads in flower vxlan nfp: register flower reprs for egress dev offload nfp: inherit the max_mtu from the PF netdev nfp: fix vlan receive MAC statistics typo nfp: fix flower offload metadata flag usage virto_net: remove empty file 'virtio_net.' 
net/sctp: Always set scope_id in sctp_inet6_skb_msgname fealnx: Fix building error on MIPS isdn: hisax: Fix pnp_irq's error checking for setup_teles3 isdn: hisax: Fix pnp_irq's error checking for setup_sedlbauer_isapnp isdn: hisax: Fix pnp_irq's error checking for setup_niccy isdn: hisax: Fix pnp_irq's error checking for setup_ix1micro ...	2017-11-17 20:18:37 -08:00
John Hurley	0115552eac	nfp: remove false positive offloads in flower vxlan Pass information to the match offload on whether or not the repr is the ingress or egress dev. Only accept tunnel matches if repr is the egress dev. This means rules such as the following are successfully offloaded: tc .. add dev vxlan0 .. enc_dst_port 4789 .. action redirect dev nfp_p0 While rules such as the following are rejected: tc .. add dev nfp_p0 .. enc_dst_port 4789 .. action redirect dev vxlan0 Also reject non tunnel flows that are offloaded to an egress dev. Non tunnel matches assume that the offload dev is the ingress port and offload a match accordingly. Fixes: `611aec101a` ("nfp: compile flower vxlan tunnel metadata match fields") Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-17 14:09:36 +09:00
John Hurley	1a24d4f9c0	nfp: register flower reprs for egress dev offload Register a callback for offloading flows that have a repr as their egress device. The new egdev_register function is added to net-next for the 4.15 release. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-17 14:09:36 +09:00
Dirk van der Merwe	743ba5b47f	nfp: inherit the max_mtu from the PF netdev The PF netdev is used for data transfer for reprs, so reprs inherit the maximum MTU settings of the PF netdev. Fixes: `5de73ee467` ("nfp: general representor implementation") Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-17 14:09:36 +09:00
Pieter Jansen van Vuuren	745eaf9afe	nfp: fix vlan receive MAC statistics typo Correct typo in vlan receive MAC stats. Previously the MAC statistics reported in ethtool for vlan receive contained a typo resulting in ethtool reporting rx_vlan_reveive_ok instead of rx_vlan_received_ok. Fixes: `a5950182c0` ("nfp: map mac_stats and vf_cfg BARs") Fixes: `098ce840c9` ("nfp: report MAC statistics in ethtool") Reported-by: Brendan Galloway <brendan.galloway@netronome.com> Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-17 14:09:35 +09:00
Pieter Jansen van Vuuren	6c3ab204f4	nfp: fix flower offload metadata flag usage Hardware has no notion of new or last mask id, instead it makes use of the message type (i.e. add flow or del flow) in combination with a single bit in metadata flags to determine when to add or delete a mask id. Previously we made use of the new or last flags to indicate that a new mask should be allocated or deallocated, respectively. This incorrect behaviour is fixed by making use of a single bit in metadata flags to indicate mask allocation or deallocation. Fixes: `43f84b72c5` ("nfp: add metadata to each flow offload") Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-17 14:09:35 +09:00
Linus Torvalds	7c225c69f8	Merge branch 'akpm' (patches from Andrew) Merge updates from Andrew Morton: - a few misc bits - ocfs2 updates - almost all of MM * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (131 commits) memory hotplug: fix comments when adding section mm: make alloc_node_mem_map a void call if we don't have CONFIG_FLAT_NODE_MEM_MAP mm: simplify nodemask printing mm,oom_reaper: remove pointless kthread_run() error check mm/page_ext.c: check if page_ext is not prepared writeback: remove unused function parameter mm: do not rely on preempt_count in print_vma_addr mm, sparse: do not swamp log with huge vmemmap allocation failures mm/hmm: remove redundant variable align_end mm/list_lru.c: mark expected switch fall-through mm/shmem.c: mark expected switch fall-through mm/page_alloc.c: broken deferred calculation mm: don't warn about allocations which stall for too long fs: fuse: account fuse_inode slab memory as reclaimable mm, page_alloc: fix potential false positive in __zone_watermark_ok mm: mlock: remove lru_add_drain_all() mm, sysctl: make NUMA stats configurable shmem: convert shmem_init_inodecache() to void Unify migrate_pages and move_pages access checks mm, pagevec: rename pagevec drained field ...	2017-11-15 19:42:40 -08:00
Mel Gorman	453f85d43f	mm: remove __GFP_COLD As the page free path makes no distinction between cache hot and cold pages, there is no real useful ordering of pages in the free list that allocation requests can take advantage of. Judging from the users of __GFP_COLD, it is likely that a number of them are the result of copying other sites instead of actually measuring the impact. Remove the __GFP_COLD parameter which simplifies a number of paths in the page allocator. This is potentially controversial but bear in mind that the size of the per-cpu pagelists versus modern cache sizes means that the whole per-cpu list can often fit in the L3 cache. Hence, there is only a potential benefit for microbenchmarks that alloc/free pages in a tight loop. It's even worse when THP is taken into account which has little or no chance of getting a cache-hot page as the per-cpu list is bypassed and the zeroing of multiple pages will thrash the cache anyway. The truncate microbenchmarks are not shown as this patch affects the allocation path and not the free path. A page fault microbenchmark was tested but it showed no significant difference which is not surprising given that the __GFP_COLD branches are a minuscule percentage of the fault path. Link: http://lkml.kernel.org/r/20171018075952.10627-9-mgorman@techsingularity.net Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Johannes Weiner <hannes@cmpxchg.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-11-15 18:21:06 -08:00
Manish Kurup	bf068bdd3c	nfp flower action: Modified to use VLAN helper functions Modified netronome nfp flower action to use VLAN helper functions instead of accessing/referencing TC act_vlan private structures directly. Reviewed-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com> Signed-off-by: Manish Kurup <manish.kurup@verizon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-10 15:32:20 +09:00
Dirk van der Merwe	0d08709383	nfp: implement ethtool FEC mode settings Add support in the driver ethtool ops to modify the NFP FEC modes. The FEC modes can be set for vNIC associated with physical ports or for MAC representor netdevs. Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 23:23:27 +09:00
Dirk van der Merwe	b471232e2c	nfp: add helpers for FEC support Implement helpers to determine and modify FEC modes via the NSP. The NSP advertises FEC capabilities on a per port basis and provides support for: * Auto mode selection * Reed Solomon * BaseR * None/Off Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-11-05 23:23:26 +09:00


781 Commits