linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2025-01-13 17:36:25 +07:00

Author	SHA1	Message	Date
David Howells	dfc3da4404	rxrpc: Need to start the resend timer on initial transmission When a DATA packet has its initial transmission, we may need to start or adjust the resend timer. Without this we end up relying on being sent a NACK to initiate the resend. Signed-off-by: David Howells <dhowells@redhat.com>	2016-09-23 14:05:12 +01:00
David Howells	98dafac569	rxrpc: Use before_eq() and friends to compare serial numbers before_eq() and friends should be used to compare serial numbers (when not checking for (non)equality) rather than casting to int, subtracting and checking the result. Signed-off-by: David Howells <dhowells@redhat.com>	2016-09-23 14:05:08 +01:00
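The wraparound-safe comparison this commit message refers to can be sketched in userspace C as below. The names `serial_before()` and `serial_before_eq()` are hypothetical stand-ins, not the kernel's own helpers, and the sketch assumes 32-bit serial numbers and the usual two's-complement conversion to `int32_t`.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for before()/before_eq()-style helpers.
 * The unsigned subtraction wraps modulo 2^32; reinterpreting the
 * difference as signed says which value is "earlier", as long as the
 * two serials are less than 2^31 apart.
 */
static inline bool serial_before(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

static inline bool serial_before_eq(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) <= 0;
}
```

Wrapping the cast-and-subtract idiom in named helpers keeps the intent visible at each call site, which appears to be the point of the change.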
David S. Miller	7a048d5adc	Merge branch 'bpf-helper-improvements' Daniel Borkmann says: ==================== Few minor BPF helper improvements Just a few minor improvements around BPF helpers, first one is a fix but given this late stage and that it's not really a critical one, I think net-next is just fine. For details please see the individual patches. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 08:40:36 -04:00
Daniel Borkmann	7a4b28c6cc	bpf: add helper to invalidate hash Add a small helper that complements `36bbef52c7` ("bpf: direct packet write and access for helpers for clsact progs") for invalidating the current skb->hash after mangling on headers via direct packet write. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 08:40:28 -04:00
Daniel Borkmann	669dc4d76d	bpf: use bpf_get_smp_processor_id_proto instead of raw one Same motivation as in commit `80b48c4457` ("bpf: don't use raw processor id in generic helper"), but this time for XDP typed programs. Thus, allow for preemption checks when we have DEBUG_PREEMPT enabled, and otherwise use the raw variant. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 08:40:28 -04:00
Daniel Borkmann	2d48c5f933	bpf: use skb_to_full_sk helper in bpf_skb_under_cgroup We need to use skb_to_full_sk() helper introduced in commit `bd5eb35f16` ("xfrm: take care of request sockets") as otherwise we miss tcp synack messages, since ownership is on request socket and therefore it would miss the sk_fullsock() check. Use skb_to_full_sk() as also done similarly in the bpf_get_cgroup_classid() helper via `2309236c13` ("cls_cgroup: get sk_classid only from full sockets") fix to not let this fall through. Fixes: `4a482f34af` ("cgroup: bpf: Add bpf_skb_in_cgroup_proto") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 08:40:27 -04:00
David S. Miller	c14fec3969	Merge branch 'hv_netvsc-next' Stephen Hemminger says: ==================== hv_netvsc changes These are mostly about improving the handling of interaction between the virtual network device (netvsc) and the SR-IOV VF network device. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 08:39:54 -04:00
Stephen Hemminger	f7ad75b753	hv_netvsc: count multicast packets received Useful for debugging issues with multicast and SR-IOV to keep track of number of received multicast packets. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 08:39:49 -04:00
Stephen Hemminger	9cbcc42806	hv_netvsc: remove VF in flight counters Since VF reference is now protected by RCU, no longer need the VF usage counter and can use device flags to see whether to inject or not. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 08:39:49 -04:00
Stephen Hemminger	f207c10d98	hv_netvsc: use RCU to protect vf_netdev The vf_netdev pointer in the netvsc device context can simply be protected by RCU because network device destruction is already RCU synchronized. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 08:39:49 -04:00
Stephen Hemminger	e8ff40d4bf	hv_netvsc: improve VF device matching The code to associate netvsc and VF devices can be made less error prone by using a better matching algorithms. On registration, use the permanent address which avoids any possible issues caused by device MAC address being changed. For all other callbacks, search by the netdevice pointer value to ensure getting the correct network device. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 08:39:49 -04:00
Stephen Hemminger	ee837a1373	hv_netvsc: simplify callback event code The callback handler for netlink events can be simplified: * Consolidate check for netlink callback events about this driver itself. * Ignore non-Ethernet devices. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 08:39:49 -04:00
Stephen Hemminger	07d0f0008c	hv_netvsc: dev hold/put reference to VF The netvsc driver holds a pointer to the virtual function network device if managing SR-IOV association. In order to ensure that the VF network device does not disappear, it should be using dev_hold/dev_put to get a reference count. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 08:39:48 -04:00
Stephen Hemminger	17db4bcef3	hv_netvsc: use consume_skb Packets that are transmitted in normal path should use consume_skb instead of kfree_skb. This allows for better tracing of packet drops. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 08:39:48 -04:00
David S. Miller	dd5a3005eb	Merge branch 'dsa-port-fast-ageing' Vivien Didelot says: ==================== net: dsa: add port fast ageing Today the DSA drivers are in charge of flushing the MAC addresses associated to a port when its STP state changes from Learning or Forwarding, to Disabled or Blocking or Listening. This makes the drivers more complex and hides this generic switch logic. This patchset introduces a new optional port_fast_age operation to dsa_switch_ops, to move this logic to the DSA layer and keep drivers simple. b53 and mv88e6xxx are updated accordingly. ==================== Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 08:38:59 -04:00
Vivien Didelot	749efcb814	net: dsa: mv88e6xxx: implement DSA port fast ageing Now that the DSA layer handles port fast ageing on correct STP change, simplify _mv88e6xxx_port_state and implement mv88e6xxx_port_fast_age. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 08:38:50 -04:00
Vivien Didelot	597698f1e0	net: dsa: b53: implement DSA port fast ageing Remove the fast ageing logic from b53_br_set_stp_state and implement the new DSA switch port_fast_age operation instead. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 08:38:50 -04:00
Vivien Didelot	732f794c1b	net: dsa: add port fast ageing Today the DSA drivers are in charge of flushing the MAC addresses associated to a port when its STP state changes from Learning or Forwarding, to Disabled or Blocking or Listening. This makes the drivers more complex and hides the generic switch logic. Introduce a new optional port_fast_age operation to dsa_switch_ops, to move this logic to the DSA layer and keep drivers simple. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 08:38:50 -04:00
Vivien Didelot	4acfee8143	net: dsa: add port STP state helper Add a void helper to set the STP state of a port, checking first if the required routine is provided by the driver. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 08:38:50 -04:00
Iyappan Subramanian	e3978673f5	drivers: net: xgene: Fix MSS programming Current driver programs static value of MSS in hardware register for TSO offload engine to segment the TCP payload regardless the MSS value provided by network stack. This patch fixes this by programming hardware registers with the stack provided MSS value. Since the hardware has the limitation of having only 4 MSS registers, this patch uses reference count of mss values being used. Signed-off-by: Iyappan Subramanian <isubramanian@apm.com> Signed-off-by: Toan Le <toanle@apm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 08:38:38 -04:00
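One way to share a small, fixed number of MSS registers by reference count is sketched below. This is not the xgene driver's code: the struct, the function names, and the slot lookup policy are assumptions made for illustration, and only the limit of four slots comes from the commit message.

```c
#include <stdint.h>

#define NUM_MSS_SLOTS 4	/* the message says the hardware has 4 MSS registers */

/* Hypothetical bookkeeping for one hardware MSS register. */
struct mss_slot {
	uint32_t mss;		/* value currently programmed */
	unsigned int refs;	/* users of this value */
};

struct mss_table {
	struct mss_slot slot[NUM_MSS_SLOTS];
};

/* Return the slot index holding @mss, claiming a free slot if needed,
 * or -1 when all slots are in use by other values. */
static int mss_get(struct mss_table *t, uint32_t mss)
{
	int i, free_idx = -1;

	for (i = 0; i < NUM_MSS_SLOTS; i++) {
		if (t->slot[i].refs && t->slot[i].mss == mss) {
			t->slot[i].refs++;
			return i;
		}
		if (!t->slot[i].refs && free_idx < 0)
			free_idx = i;
	}
	if (free_idx < 0)
		return -1;

	t->slot[free_idx].mss = mss;	/* a driver would write the register here */
	t->slot[free_idx].refs = 1;
	return free_idx;
}

/* Drop one reference; the slot becomes reusable when refs reaches 0. */
static void mss_put(struct mss_table *t, int idx)
{
	if (idx >= 0 && idx < NUM_MSS_SLOTS && t->slot[idx].refs)
		t->slot[idx].refs--;
}
```

A caller that gets -1 would need some fallback, for example segmenting in software; the commit message does not say what the driver does in that case.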
David Howells	90bd684ded	rxrpc: Should be using ktime_add_ms() not ktime_add_ns() ktime_add_ms() should be used to add the resend time (in ms) rather than ktime_add_ns(). Signed-off-by: David Howells <dhowells@redhat.com>	2016-09-23 13:23:09 +01:00
David Howells	c0d058c21c	rxrpc: Make sure sendmsg() is woken on call completion Make sure that sendmsg() gets woken up if the call it is waiting for completes abnormally. Signed-off-by: David Howells <dhowells@redhat.com>	2016-09-23 13:23:09 +01:00
David Howells	9aff212bd6	rxrpc: Don't send an ACK at the end of service call response transmission Don't send an IDLE ACK at the end of the transmission of the response to a service call. The service end resends DATA packets until the client sends an ACK that hard-acks all the send data. At that point, the call is complete. Signed-off-by: David Howells <dhowells@redhat.com>	2016-09-23 13:23:09 +01:00
David Howells	b24d2891cf	rxrpc: Preset timestamp on Tx sk_buffs Set the timestamp on sk_buffs holding packets to be transmitted before queueing them because the moment the packet is on the queue it can be seen by the retransmission algorithm - which may see a completely random timestamp. If the retransmission algorithm sees such a timestamp, it may retransmit the packet and, in future, tell the congestion management algorithm that the retransmit timer expired. Signed-off-by: David Howells <dhowells@redhat.com>	2016-09-23 13:17:52 +01:00
Colin Ian King	e12934d980	cxgb4: fix signed wrap around when decrementing index idx Change predecrement compare to post decrement compare to avoid an unsigned integer wrap-around comparison when decrementing idx in the while loop. For example, when idx is zero, the current situation will predecrement idx in the while loop, wrapping idx to the maximum signed integer and cause out of bounds reads on rxq_info->msix_tbl[idx]. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 07:25:16 -04:00
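The difference between the two loop forms can be sketched as follows, assuming `idx` is an unsigned index as the body of the message says. `unwind_entries()` and its `visited` array are hypothetical and stand in for the driver's cleanup loop.

```c
#include <stddef.h>

/*
 * Hypothetical unwind loop: visit entries idx-1 down to 0 and record
 * the order in @visited. Returns how many entries were visited.
 *
 * With an unsigned index, a pre-decrement test such as `--idx >= 0`
 * is always true, and starting from 0 it wraps to a huge value before
 * the body runs. The post-decrement form tests the old value first,
 * so idx == 0 skips the body entirely.
 */
static size_t unwind_entries(unsigned int idx, unsigned int *visited)
{
	size_t n = 0;

	while (idx-- > 0)
		visited[n++] = idx;

	return n;
}
```

Called with `idx == 3`, the loop visits 2, 1, 0 in that order; called with `idx == 0`, it touches nothing.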
David S. Miller	c7b9e63341	Merge branch 'mlx5-sriov-vlan-push-pop' Saeed Mahameed says: ==================== Mellanox 100G SRIOV offloads vlan push/pop From Or Gerlitz: This series further enhances the SRIOV TC offloads of mlx5 to handle the TC vlan push and pop actions. This serves a common use-case in virtualization systems where the virtual switch add (push) vlan tags to packets sent from VMs and removes (pop) vlan tags before the packet is received by the VM. We use the new E-Switch switchdev mode and the TC vlan action to achieve that also in SW defined SRIOV environments by offloading TC rules that contain this action along with forwarding (TC mirred/redirect action) the packet. In the first patch we add some helpers to access the TC vlan action info by offloading drivers. The next five patches don't add any new functionality, they do some refactoring and cleanups in the current code to be used next. The seventh patch deals with supporting vlans by the mlx5 e-switch in switchdev mode. The eighth patch does the vlan action offload from TC and the last patch adds matching for vlans as typically required by TC flows that involve vlan pop action. The series was applied on top of commit `524605e` "cxgb4: Convert to use simple_open()" ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 07:22:19 -04:00
Or Gerlitz	095b6cfd69	net/mlx5e: Add TC vlan match parsing Enhance the parsing of offloaded TC rules matches to handle vlans. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 07:22:12 -04:00
Or Gerlitz	8b32580df1	net/mlx5e: Add TC vlan action for SRIOV offloads Parse TC vlan actions and set the required elements to allow offloading. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 07:22:12 -04:00
Or Gerlitz	f5f8247609	net/mlx5: E-Switch, Support VLAN actions in the offloads mode Many virtualization systems use a policy under which a vlan tag is pushed to packets sent by guests, and popped before the packet is forwarded to the VM. The current generation of the mlx5 HW doesn't fully support that on a per flow level. As such, we are addressing the above common use case with the SRIOV e-Switch abilities to push vlan into packets sent by VFs and pop vlan from packets forwarded to VFs. The HW can match on the correct vlan being present in packets forwarded to VFs (eSwitch steering is done before stripping the tag), so this part is offloaded as is. A common practice for vlans is to avoid both push vlan and pop vlan for inter-host VM/VM (east-west) communication because in this case, push on egress cancels out with pop on ingress. For supporting that, we use a global eswitch vlan pop policy, hence allowing guest A to communicate with both remote VM B and local VM C. This works since the HW pops the vlan only if it exists (e.g for C --> A packets but not for B --> A packets). On the slow path, when a VF vport has an offloaded flow which involves pushing vlans, wheres another flow is not currently offloaded, the packets from the 2nd flow seen by the VF representor on the host have vlan. The VF rep driver removes such vlan before calling into the host networking stack. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 07:22:12 -04:00
Or Gerlitz	8515c581df	net/mlx5e: Refactor retrival of skb from rx completion element (cqe) Factor the relevant code into a static inline helper (skb_from_cqe) doing that. Move the call to napi_gro_receive to be carried out just after mlx5e_complete_rx_cqe returns. Both changes are to be used for the VF representor as well in the next commit. This patch doesn't change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 07:22:12 -04:00
Or Gerlitz	776b12b674	net/mlx5: Put elements related to offloaded TC rule in one struct Put the representors related to the source and dest vports and the action in struct mlx5_esw_flow_attr which is used while setting the FDB rule. This patch doesn't change any functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 07:22:12 -04:00
Or Gerlitz	e33dfe316c	net/mlx5: E-Switch, Allow fine tuning of eswitch vport push/pop vlan The HW can be programmed to push vlan, pop vlan or both. A factorization step towards using the push/pop capabilties in the eswitch offloads mode. This patch doesn't add new functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 07:22:12 -04:00
Or Gerlitz	bac9b6aa1d	net/mlx5: E-Switch, Set vport representor fields explicitly on registration The structure we use for the eswitch vport representor (mlx5_eswitch_rep) has some fields which are set from upper layers in the driver when they register the rep. Use explicit setting on registration time for them and avoid global memcpy. This patch doesn't add new functionality. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 07:22:12 -04:00
Or Gerlitz	9deb2241f1	net/mlx5: E-Switch, Set the vport when registering the uplink rep Set the vport value in the PF entry to be that of the uplink so we can use it blindly over the tc / eswitch offload code without translating it each time we deal with the uplink representor. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 07:22:12 -04:00
Or Gerlitz	53e89941ba	net_sched: act_vlan: add helper inlines to access tcf_vlan info Needed e.g for offloading drivers to pick the relevant attributes. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 07:22:11 -04:00
Eric Dumazet	fefa569a9d	net_sched: sch_fq: account for schedule/timers drifts It looks like the following patch can make FQ very precise, even in VM or stressed hosts. It matters at high pacing rates. We take into account the difference between the time that was programmed when last packet was sent, and current time (a drift of tens of usecs is often observed) Add an EWMA of the unthrottle latency to help diagnostics. This latency is the difference between current time and oldest packet in delayed RB-tree. This accounts for the high resolution timer latency, but can be different under stress, as fq_check_throttled() can be opportunistically be called from a dequeue() called after an enqueue() for a different flow. Tested: // Start a 10Gbit flow $ netperf --google-pacing-rate 1250000000 -H lpaa24 -l 10000 -- -K bbr & Before patch : $ sar -n DEV 10 5 | grep eth0 | grep Average Average: eth0 17106.04 756876.84 1102.75 1119049.02 0.00 0.00 0.52 After patch : $ sar -n DEV 10 5 | grep eth0 | grep Average Average: eth0 17867.00 800245.90 1151.77 1183172.12 0.00 0.00 0.52 A new iproute2 tc can output the 'unthrottle latency' : $ tc -s qd sh dev eth0 | grep latency 0 gc, 0 highprio, 32490767 throttled, 2382 ns latency Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 07:19:06 -04:00
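An exponentially weighted moving average (EWMA) of a latency sample can be kept with integer shifts, as in the minimal sketch below. The function name `ewma_update()` and the 1/8 weight are assumptions chosen for illustration; the commit message does not state the weight the patch uses.

```c
#include <stdint.h>

/*
 * Integer EWMA with weight 1/8 (assumed):
 *   avg_new = avg - avg/8 + sample/8
 * Shifts replace the divisions, so no floating point is needed.
 */
static uint64_t ewma_update(uint64_t avg, uint64_t sample)
{
	avg -= avg >> 3;
	avg += sample >> 3;
	return avg;
}
```

Each new sample moves the average one eighth of the way toward itself, so a single outlier has limited effect while a sustained shift shows up within a few dozen samples.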
Bert Kenward	429baa6f0e	sfc: check async completer is !NULL before calling Add a NULL check before calling asynchronous MCDI completion functions during device removal. Fixes: `7014d7f6` ("sfc: allow asynchronous MCDI without completion function") Signed-off-by: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 07:18:35 -04:00
David S. Miller	88e4f75900	Merge branch 'sctp-fix-gap-ack-blocks' Marcelo Ricardo Leitner says: ==================== sctp: fix the handling of SACK Gap Ack blocks sctp_acked() is using 32bit arithmetics on 16bits vars, via TSN_lte() macros, which is weird and confusing. Once the offset to ctsn is calculated, all wrapping is already handled and thus to verify the Gap Ack blocks we can just use pure less/big-or-equal than checks. Also, rename gap variable to tsn_offset, so it's more meaningful, as it doesn't point to any gap at all. Even so, I don't think this discrepancy resulted in any practical bug. This patch is a preparation for the next one, which will introduce typecheck() for TSN_lte() macros and would cause a compile error here. Suggested-by: David Laight <David.Laight@ACULAB.COM> Reported-by: David Laight <David.Laight@ACULAB.COM> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 06:55:04 -04:00
Marcelo Ricardo Leitner	182691d099	sctp: improve how SSN, TSN and ASCONF serial are compared Make it similar to time_before() macros: - easier to understand - make use of typecheck() to avoid working on unexpected variable types (made the issue on previous patch visible) - for _[lg]te versions, slighly faster, as the compiler used to generate a sequence of cmp/je/cmp/js instructions and now it's sub/test/jle (for _lte): Before, for sctp_outq_sack: if (primary->cacc.changeover_active) { 1f01: 80 b9 84 02 00 00 00 cmpb $0x0,0x284(%rcx) 1f08: 74 6e je 1f78 <sctp_outq_sack+0xe8> u8 clear_cycling = 0; if (TSN_lte(primary->cacc.next_tsn_at_change, sack_ctsn)) { 1f0a: 8b 81 80 02 00 00 mov 0x280(%rcx),%eax return ((s) - (t)) & TSN_SIGN_BIT; } static inline int TSN_lte(__u32 s, __u32 t) { return ((s) == (t)) || (((s) - (t)) & TSN_SIGN_BIT); 1f10: 8b 7d bc mov -0x44(%rbp),%edi 1f13: 39 c7 cmp %eax,%edi 1f15: 74 25 je 1f3c <sctp_outq_sack+0xac> 1f17: 39 f8 cmp %edi,%eax 1f19: 78 21 js 1f3c <sctp_outq_sack+0xac> primary->cacc.changeover_active = 0; After: if (primary->cacc.changeover_active) { 1ee7: 80 b9 84 02 00 00 00 cmpb $0x0,0x284(%rcx) 1eee: 74 73 je 1f63 <sctp_outq_sack+0xf3> u8 clear_cycling = 0; if (TSN_lte(primary->cacc.next_tsn_at_change, sack_ctsn)) { 1ef0: 8b 81 80 02 00 00 mov 0x280(%rcx),%eax 1ef6: 2b 45 b4 sub -0x4c(%rbp),%eax 1ef9: 85 c0 test %eax,%eax 1efb: 7e 26 jle 1f23 <sctp_outq_sack+0xb3> primary->cacc.changeover_active = 0; *_lt() generated pretty much the same code. Tested with gcc (GCC) 6.1.1 20160621. This patch also removes SSN_lte as it is not used and cleanups some comments. Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 06:54:58 -04:00
Marcelo Ricardo Leitner	a3007446e5	sctp: fix the handling of SACK Gap Ack blocks sctp_acked() is using 32bit arithmetics on 16bits vars, via TSN_lte() macros, which is weird and confusing. Once the offset to ctsn is calculated, all wrapping is already handled and thus to verify the Gap Ack blocks we can just use pure less/big-or-equal than checks. Also, rename gap variable to tsn_offset, so it's more meaningful, as it doesn't point to any gap at all. Even so, I don't think this discrepancy resulted in any practical bug. This patch is a preparation for the next one, which will introduce typecheck() for TSN_lte() macros and would cause a compile error here. Suggested-by: David Laight <David.Laight@ACULAB.COM> Reported-by: David Laight <David.Laight@ACULAB.COM> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 06:54:58 -04:00
WANG Cong	21641c2e1f	net_sched: check NULL on error path in route4_change() On error path in route4_change(), 'f' could be NULL, so we should check NULL before calling tcf_exts_destroy(). Fixes: `b9a24bb76b` ("net_sched: properly handle failure case of tcf_exts_init()") Reported-by: kbuild test robot <fengguang.wu@intel.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-09-23 06:51:49 -04:00
David S. Miller	d6989d4bbe	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2016-09-23 06:46:57 -04:00
Pablo Neira Ayuso	4004d5c374	netfilter: nft_lookup: remove superfluous element found check We already checked for !found just a bit before: if (!found) { regs->verdict.code = NFT_BREAK; return; } if (found && set->flags & NFT_SET_MAP) ^^^^^ So this redundant check can just go away. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-23 09:30:48 +02:00
Gao Feng	b9d80f83bf	netfilter: xt_helper: Use sizeof(variable) instead of literal number It's better to use sizeof(info->name)-1 as index to force set the string tail instead of literal number '29'. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-23 09:30:43 +02:00
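The idiom described in this message can be sketched as follows. `struct helper_info` and `terminate_name()` are hypothetical; the 30-byte buffer is chosen only so that the last index matches the literal 29 mentioned in the message.

```c
/* Hypothetical struct; only the fixed-size name buffer matters here. */
struct helper_info {
	int invert;
	char name[30];
};

/*
 * Force NUL termination at the last byte of the buffer. Deriving the
 * index from sizeof() keeps it correct if the array size ever changes,
 * which a hard-coded 29 would not.
 */
static void terminate_name(struct helper_info *info)
{
	info->name[sizeof(info->name) - 1] = '\0';
}
```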
Gao Feng	7bdc66242d	netfilter: Enhance the codes used to get random once There are some codes which are used to get one random once in netfilter. We could use net_get_random_once to simplify these codes. Signed-off-by: Gao Feng <fgao@ikuai8.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-23 09:30:36 +02:00
Liping Zhang	a20877b5ed	netfilter: nf_tables: check tprot_set first when we use xt.thoff pkt->xt.thoff is not always set properly, but we use it without any check. For payload expr, it will cause wrong results. For nftrace, we may notify the wrong network or transport header to the user space, furthermore, input the following nft rules, warning message will be printed out: # nft add rule arp filter output meta nftrace set 1 WARNING: CPU: 0 PID: 13428 at net/netfilter/nf_tables_trace.c:263 nft_trace_notify+0x4a3/0x5e0 [nf_tables] Call Trace: [<ffffffff813d58ae>] dump_stack+0x63/0x85 [<ffffffff810a4c0b>] __warn+0xcb/0xf0 [<ffffffff810a4d3d>] warn_slowpath_null+0x1d/0x20 [<ffffffffa0589703>] nft_trace_notify+0x4a3/0x5e0 [nf_tables] [ ... ] [<ffffffffa05690a8>] nft_do_chain_arp+0x78/0x90 [nf_tables_arp] [<ffffffff816f4aa2>] nf_iterate+0x62/0x80 [<ffffffff816f4b33>] nf_hook_slow+0x73/0xd0 [<ffffffff81732bbf>] arp_xmit+0x8f/0xb0 [ ... ] [<ffffffff81732d36>] arp_solicit+0x106/0x2c0 So before we use pkt->xt.thoff, check the tprot_set first. Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-23 09:30:26 +02:00
Liping Zhang	8dc3c2b86b	netfilter: nf_tables: improve nft payload fast eval There's an off-by-one issue in nft_payload_fast_eval, skb_tail_pointer and ptr + priv->len all point to the last valid address plus 1. So if they are equal, we can still fetch the valid data. It's unnecessary to fall back to nft_payload_eval. Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-23 09:30:16 +02:00
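The boundary condition described here can be sketched with a plain buffer. `range_fits()` is a hypothetical helper, not the nf_tables code, and `tail` is assumed to point one past the last valid byte and to be at or after `ptr`.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Does a read of @len bytes starting at @ptr stay inside the buffer?
 * @tail points one past the last valid byte, so a read that ends
 * exactly at @tail is still valid: the comparison must be <=, not <.
 * Comparing lengths avoids forming a pointer beyond @tail.
 */
static bool range_fits(const unsigned char *ptr, size_t len,
		       const unsigned char *tail)
{
	return len <= (size_t)(tail - ptr);
}
```

With an 8-byte buffer, a 4-byte read at offset 4 ends exactly at the tail and fits, while the same read at offset 5 does not.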
Liping Zhang	2462f3f4a7	netfilter: nf_queue: improve queue range support for bridge family After commit `ac28634456` ("netfilter: bridge: add nf_afinfo to enable queuing to userspace"), we can queue packets to the user space in bridge family. But when the user specify the queue range, packets will be only delivered to the first queue num. Because in nfqueue_hash, we only support ipv4 and ipv6 family. Now add support for bridge family too. Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-23 09:30:01 +02:00
Liping Zhang	8061bb5443	netfilter: nft_queue: add _SREG_QNUM attr to select the queue number Currently, the user can specify the queue numbers by _QUEUE_NUM and _QUEUE_TOTAL attributes, this is enough in most situations. But acctually, it is not very flexible, for example: tcp dport 80 mapped to queue0 tcp dport 81 mapped to queue1 tcp dport 82 mapped to queue2 In order to do this thing, we must add 3 nft rules, and more mapping meant more rules ... So take one register to select the queue number, then we can add one simple rule to mapping queues, maybe like this: queue num tcp dport map { 80:0, 81:1, 82:2 ... } Florian Westphal also proposed wider usage scenarios: queue num jhash ip saddr . ip daddr mod ... queue num meta cpu ... queue num meta mark ... The last point is how to load a queue number from sreg, although we can use (u16)&regs->data[reg] to load the queue number, just like nat expr to load its l4port do. But we will cooperate with hash expr, meta cpu, meta mark expr and so on. They all store the result to u32 type, so cast it to u16 pointer and dereference it will generate wrong result in the big endian system. So just keep it simple, we treat queue number as u32 type, although u16 type is already enough. Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-23 09:29:50 +02:00
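The big-endian problem mentioned at the end of this message can be shown without big-endian hardware by laying out the bytes explicitly. The helper names below are hypothetical, and the byte array stands in for a 32-bit register as a big-endian CPU would store it.

```c
#include <stdint.h>

/* Lay out @v the way a big-endian CPU stores a u32: high byte first. */
static void store_be32(uint8_t out[4], uint32_t v)
{
	out[0] = (uint8_t)(v >> 24);
	out[1] = (uint8_t)(v >> 16);
	out[2] = (uint8_t)(v >> 8);
	out[3] = (uint8_t)v;
}

/*
 * What a u16 load from the start of that register sees on a big-endian
 * CPU: the first two bytes, which are the HIGH half of the value.
 */
static uint16_t read_first_be16(const uint8_t in[4])
{
	return (uint16_t)((in[0] << 8) | in[1]);
}

/* Reading the whole register as a u32 recovers the stored value. */
static uint32_t read_be32(const uint8_t in[4])
{
	return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
	       ((uint32_t)in[2] << 8) | (uint32_t)in[3];
}
```

Storing queue number 2 and reading it back through the 16-bit view yields 0, while the 32-bit read yields 2, which is consistent with the message's choice to treat the queue number as a u32.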
Laura Garcia Liebana	36b701fae1	netfilter: nf_tables: validate maximum value of u32 netlink attributes Fetch value and validate u32 netlink attribute. This validation is usually required when the u32 netlink attributes are being stored in a field whose size is smaller. This patch revisits `4da449ae1d` ("netfilter: nft_exthdr: Add size check on u8 nft_exthdr attributes"). Fixes: `96518518cc` ("netfilter: add nftables") Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Laura Garcia Liebana <nevola@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-09-23 09:29:02 +02:00

619785 Commits