Commit Graph

89389 Commits

Author SHA1 Message Date
Jason A. Donenfeld
69b4ed5cbf net: sfc: use skb_list_walk_safe helper for gso segments
This is a straight-forward conversion case for the new function, and
while we're at it, we can remove a null write to skb->next by replacing
it with skb_mark_not_on_list.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 15:19:55 -08:00
Jason A. Donenfeld
90919f1450 net: sunvnet: use skb_list_walk_safe helper for gso segments
This is a straight-forward conversion case for the new function, and
while we're at it, we can remove a null write to skb->next by replacing
it with skb_mark_not_on_list.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 15:19:55 -08:00
Jason A. Donenfeld
9f0722380f net: tg3: use skb_list_walk_safe helper for gso segments
This is a straight-forward conversion case for the new function, and
while we're at it, we can remove a null write to skb->next by replacing
it with skb_mark_not_on_list.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 15:19:55 -08:00
Jason A. Donenfeld
1d7a7438d7 net: r8152: use skb_list_walk_safe helper for gso segments
This is a straight-forward conversion case for the new function, and
while we're at it, we can remove a null write to skb->next by replacing
it with skb_mark_not_on_list.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 15:19:55 -08:00
Jason A. Donenfeld
5643a552d3 net: tap: use skb_list_walk_safe helper for gso segments
This is a straight-forward conversion case for the new function, and
while we're at it, we can remove a null write to skb->next by replacing
it with skb_mark_not_on_list.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 15:19:55 -08:00
Jason A. Donenfeld
dcfea72e79 net: introduce skb_list_walk_safe for skb segment walking
As part of the continual effort to remove direct usage of skb->next and
skb->prev, this patch adds a helper for iterating through the
singly-linked variant of skb lists, which are used for lists of GSO
packet. The name "skb_list_..." has been chosen to match the existing
function, "kfree_skb_list, which also operates on these singly-linked
lists, and the "..._walk_safe" part is the same idiom as elsewhere in
the kernel.

This patch removes the helper from wireguard and puts it into
linux/skbuff.h, while making it a bit more robust for general usage. In
particular, parenthesis are added around the macro argument usage, and it
now accounts for trying to iterate through an already-null skb pointer,
which will simply run the iteration zero times. This latter enhancement
means it can be used to replace both do { ... } while and while (...)
open-coded idioms.

This should take care of these three possible usages, which match all
current methods of iterations.

skb_list_walk_safe(segs, skb, next) { ... }
skb_list_walk_safe(skb, skb, next) { ... }
skb_list_walk_safe(segs, skb, segs) { ... }

Gcc appears to generate efficient code for each of these.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 15:19:54 -08:00
Alex Maftei (amaftei)
17d3b21c7b sfc: move common tx code
Once again, a tiny bit of refactoring was required to stitch the code
together (i.e. adding headers). The moved code deals with managing tx
queues and mappings.

Signed-off-by: Alexandru-Mihai Maftei <amaftei@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 13:28:03 -08:00
Alex Maftei (amaftei)
1751cc365f sfc: move common rx code
The moved code deals with managing rx buffers and queues.
A tiny bit of refactoring was required in other files to stitch the
code together.

Signed-off-by: Alexandru-Mihai Maftei <amaftei@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 13:28:03 -08:00
Alex Maftei (amaftei)
5f99925632 sfc: move event queue management code
Signed-off-by: Alexandru-Mihai Maftei <amaftei@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 13:28:03 -08:00
Alex Maftei (amaftei)
37c45a4e33 sfc: move channel interrupt management code
Small code styling fixes included.

Signed-off-by: Alexandru-Mihai Maftei <amaftei@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 13:28:03 -08:00
Alex Maftei (amaftei)
8397548507 sfc: move channel alloc/removal code
Reallocation and copying code is included, as well as some housekeeping
code.
Other files have been patched up a bit to accommodate the changes.

Small code styling fixes included.

Signed-off-by: Alexandru-Mihai Maftei <amaftei@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 13:28:03 -08:00
Alex Maftei (amaftei)
e20ba5b1d1 sfc: move channel start/stop code
Also includes interrupt enabling/disabling code.
Small code styling fixes included.

Signed-off-by: Alexandru-Mihai Maftei <amaftei@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 13:28:03 -08:00
Alex Maftei (amaftei)
768fd2664e sfc: move some channel-related code
Just a handful of function, but also removed many 'static' identifiers
so the code builds. These will, of course, be moved.
Module parameters for IRQ moderation threshold also moved.

Small code styling fixes included.

Signed-off-by: Alexandru-Mihai Maftei <amaftei@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 13:28:03 -08:00
Alex Maftei (amaftei)
f1826756b4 sfc: move struct init and fini code
The hardware monitor code and the reset work queue code were also
moved, with supporting macros and parameters, because they are assigned
to function pointers in the struct.
Small code styling fixes included.

Signed-off-by: Alexandru-Mihai Maftei <amaftei@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 13:28:02 -08:00
Alex Maftei (amaftei)
1eaf99fe0b sfc: move some device reset code
The rest of the reset code will be moved later.
Small code styling fixes included.

Signed-off-by: Alexandru-Mihai Maftei <amaftei@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 13:28:02 -08:00
Alex Maftei (amaftei)
7ec3de4260 sfc: move datapath management code
The code that manages the datapath (starting, stopping, including the
port-related bits) will be common.
Three functions have been added that contain bits from other
functions. These will be moved to their final files in later patches.
Small code styling fixes included.

Signed-off-by: Alexandru-Mihai Maftei <amaftei@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 13:28:02 -08:00
Alex Maftei (amaftei)
473f5ede41 sfc: move mac configuration and status functions
Two small functions with different purposes.

Signed-off-by: Alexandru-Mihai Maftei <amaftei@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 13:28:02 -08:00
Alex Maftei (amaftei)
82c6448402 sfc: move reset workqueue code
Small functions doing work that will be common, related to reset
workqueue management.

Signed-off-by: Alexandru-Mihai Maftei <amaftei@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 13:28:02 -08:00
Alex Maftei (amaftei)
b194045114 sfc: further preparation for code split
Added more arguments for a couple of functions.
Also moved a function to the common header.

Signed-off-by: Alexandru-Mihai Maftei <amaftei@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 13:28:02 -08:00
Alex Maftei (amaftei)
e1253f3910 sfc: add new headers in preparation for code split
New headers contain prototypes of functions that will be common between
ef10 and upcoming driver.
Removed static modifier from the affected functions.
Some function prototypes were removed from existing headers.

Signed-off-by: Alexandru-Mihai Maftei <amaftei@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 13:28:02 -08:00
Eric Dumazet
96cc4b6958 macvlan: do not assume mac_header is set in macvlan_broadcast()
Use of eth_hdr() in tx path is error prone.

Many drivers call skb_reset_mac_header() before using it,
but others do not.

Commit 6d1ccff627 ("net: reset mac header in dev_start_xmit()")
attempted to fix this generically, but commit d346a3fae3
("packet: introduce PACKET_QDISC_BYPASS socket option") brought
back the macvlan bug.

Lets add a new helper, so that tx paths no longer have
to call skb_reset_mac_header() only to get a pointer
to skb->data.

Hopefully we will be able to revert 6d1ccff627
("net: reset mac header in dev_start_xmit()") and save few cycles
in transmit fast path.

BUG: KASAN: use-after-free in __get_unaligned_cpu32 include/linux/unaligned/packed_struct.h:19 [inline]
BUG: KASAN: use-after-free in mc_hash drivers/net/macvlan.c:251 [inline]
BUG: KASAN: use-after-free in macvlan_broadcast+0x547/0x620 drivers/net/macvlan.c:277
Read of size 4 at addr ffff8880a4932401 by task syz-executor947/9579

CPU: 0 PID: 9579 Comm: syz-executor947 Not tainted 5.5.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x197/0x210 lib/dump_stack.c:118
 print_address_description.constprop.0.cold+0xd4/0x30b mm/kasan/report.c:374
 __kasan_report.cold+0x1b/0x41 mm/kasan/report.c:506
 kasan_report+0x12/0x20 mm/kasan/common.c:639
 __asan_report_load_n_noabort+0xf/0x20 mm/kasan/generic_report.c:145
 __get_unaligned_cpu32 include/linux/unaligned/packed_struct.h:19 [inline]
 mc_hash drivers/net/macvlan.c:251 [inline]
 macvlan_broadcast+0x547/0x620 drivers/net/macvlan.c:277
 macvlan_queue_xmit drivers/net/macvlan.c:520 [inline]
 macvlan_start_xmit+0x402/0x77f drivers/net/macvlan.c:559
 __netdev_start_xmit include/linux/netdevice.h:4447 [inline]
 netdev_start_xmit include/linux/netdevice.h:4461 [inline]
 dev_direct_xmit+0x419/0x630 net/core/dev.c:4079
 packet_direct_xmit+0x1a9/0x250 net/packet/af_packet.c:240
 packet_snd net/packet/af_packet.c:2966 [inline]
 packet_sendmsg+0x260d/0x6220 net/packet/af_packet.c:2991
 sock_sendmsg_nosec net/socket.c:639 [inline]
 sock_sendmsg+0xd7/0x130 net/socket.c:659
 __sys_sendto+0x262/0x380 net/socket.c:1985
 __do_sys_sendto net/socket.c:1997 [inline]
 __se_sys_sendto net/socket.c:1993 [inline]
 __x64_sys_sendto+0xe1/0x1a0 net/socket.c:1993
 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x442639
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 5b 10 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffc13549e08 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000442639
RDX: 000000000000000e RSI: 0000000020000080 RDI: 0000000000000003
RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000403bb0 R14: 0000000000000000 R15: 0000000000000000

Allocated by task 9389:
 save_stack+0x23/0x90 mm/kasan/common.c:72
 set_track mm/kasan/common.c:80 [inline]
 __kasan_kmalloc mm/kasan/common.c:513 [inline]
 __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:486
 kasan_kmalloc+0x9/0x10 mm/kasan/common.c:527
 __do_kmalloc mm/slab.c:3656 [inline]
 __kmalloc+0x163/0x770 mm/slab.c:3665
 kmalloc include/linux/slab.h:561 [inline]
 tomoyo_realpath_from_path+0xc5/0x660 security/tomoyo/realpath.c:252
 tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
 tomoyo_path_perm+0x230/0x430 security/tomoyo/file.c:822
 tomoyo_inode_getattr+0x1d/0x30 security/tomoyo/tomoyo.c:129
 security_inode_getattr+0xf2/0x150 security/security.c:1222
 vfs_getattr+0x25/0x70 fs/stat.c:115
 vfs_statx_fd+0x71/0xc0 fs/stat.c:145
 vfs_fstat include/linux/fs.h:3265 [inline]
 __do_sys_newfstat+0x9b/0x120 fs/stat.c:378
 __se_sys_newfstat fs/stat.c:375 [inline]
 __x64_sys_newfstat+0x54/0x80 fs/stat.c:375
 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 9389:
 save_stack+0x23/0x90 mm/kasan/common.c:72
 set_track mm/kasan/common.c:80 [inline]
 kasan_set_free_info mm/kasan/common.c:335 [inline]
 __kasan_slab_free+0x102/0x150 mm/kasan/common.c:474
 kasan_slab_free+0xe/0x10 mm/kasan/common.c:483
 __cache_free mm/slab.c:3426 [inline]
 kfree+0x10a/0x2c0 mm/slab.c:3757
 tomoyo_realpath_from_path+0x1a7/0x660 security/tomoyo/realpath.c:289
 tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
 tomoyo_path_perm+0x230/0x430 security/tomoyo/file.c:822
 tomoyo_inode_getattr+0x1d/0x30 security/tomoyo/tomoyo.c:129
 security_inode_getattr+0xf2/0x150 security/security.c:1222
 vfs_getattr+0x25/0x70 fs/stat.c:115
 vfs_statx_fd+0x71/0xc0 fs/stat.c:145
 vfs_fstat include/linux/fs.h:3265 [inline]
 __do_sys_newfstat+0x9b/0x120 fs/stat.c:378
 __se_sys_newfstat fs/stat.c:375 [inline]
 __x64_sys_newfstat+0x54/0x80 fs/stat.c:375
 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

The buggy address belongs to the object at ffff8880a4932000
 which belongs to the cache kmalloc-4k of size 4096
The buggy address is located 1025 bytes inside of
 4096-byte region [ffff8880a4932000, ffff8880a4933000)
The buggy address belongs to the page:
page:ffffea0002924c80 refcount:1 mapcount:0 mapping:ffff8880aa402000 index:0x0 compound_mapcount: 0
raw: 00fffe0000010200 ffffea0002846208 ffffea00028f3888 ffff8880aa402000
raw: 0000000000000000 ffff8880a4932000 0000000100000001 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff8880a4932300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8880a4932380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff8880a4932400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                   ^
 ffff8880a4932480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff8880a4932500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb

Fixes: b863ceb7dd ("[NET]: Add macvlan driver")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 12:52:33 -08:00
Petr Machata
3971a535b8 mlxsw: spectrum_qdisc: Ignore grafting of invisible FIFO
The following patch will change PRIO to replace a removed Qdisc with an
invisible FIFO, instead of NOOP. mlxsw will see this replacement due to the
graft message that is generated. But because FIFO does not issue its own
REPLACE message, when the graft operation takes place, the Qdisc that mlxsw
tracks under the indicated band is still the old one. The child
handle (0:0) therefore does not match, and mlxsw rejects the graft
operation, which leads to an extack message:

    Warning: Offloading graft operation failed.

Fix by ignoring the invisible children in the PRIO graft handler. The
DESTROY message of the removed Qdisc is going to follow shortly and handle
the removal.

Fixes: 32dc5efc6c ("mlxsw: spectrum: qdiscs: prio: Handle graft command")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 12:45:52 -08:00
Eric Dumazet
90d72256ad gtp: fix bad unlock balance in gtp_encap_enable_socket
WARNING: bad unlock balance detected!
5.5.0-rc5-syzkaller #0 Not tainted
-------------------------------------
syz-executor921/9688 is trying to release lock (sk_lock-AF_INET6) at:
[<ffffffff84bf8506>] gtp_encap_enable_socket+0x146/0x400 drivers/net/gtp.c:830
but there are no more locks to release!

other info that might help us debug this:
2 locks held by syz-executor921/9688:
 #0: ffffffff8a4d8840 (rtnl_mutex){+.+.}, at: rtnl_lock net/core/rtnetlink.c:72 [inline]
 #0: ffffffff8a4d8840 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x405/0xaf0 net/core/rtnetlink.c:5421
 #1: ffff88809304b560 (slock-AF_INET6){+...}, at: spin_lock_bh include/linux/spinlock.h:343 [inline]
 #1: ffff88809304b560 (slock-AF_INET6){+...}, at: release_sock+0x20/0x1c0 net/core/sock.c:2951

stack backtrace:
CPU: 0 PID: 9688 Comm: syz-executor921 Not tainted 5.5.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x197/0x210 lib/dump_stack.c:118
 print_unlock_imbalance_bug kernel/locking/lockdep.c:4008 [inline]
 print_unlock_imbalance_bug.cold+0x114/0x123 kernel/locking/lockdep.c:3984
 __lock_release kernel/locking/lockdep.c:4242 [inline]
 lock_release+0x5f2/0x960 kernel/locking/lockdep.c:4503
 sock_release_ownership include/net/sock.h:1496 [inline]
 release_sock+0x17c/0x1c0 net/core/sock.c:2961
 gtp_encap_enable_socket+0x146/0x400 drivers/net/gtp.c:830
 gtp_encap_enable drivers/net/gtp.c:852 [inline]
 gtp_newlink+0x9fc/0xc60 drivers/net/gtp.c:666
 __rtnl_newlink+0x109e/0x1790 net/core/rtnetlink.c:3305
 rtnl_newlink+0x69/0xa0 net/core/rtnetlink.c:3363
 rtnetlink_rcv_msg+0x45e/0xaf0 net/core/rtnetlink.c:5424
 netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
 rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5442
 netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
 netlink_unicast+0x58c/0x7d0 net/netlink/af_netlink.c:1328
 netlink_sendmsg+0x91c/0xea0 net/netlink/af_netlink.c:1917
 sock_sendmsg_nosec net/socket.c:639 [inline]
 sock_sendmsg+0xd7/0x130 net/socket.c:659
 ____sys_sendmsg+0x753/0x880 net/socket.c:2330
 ___sys_sendmsg+0x100/0x170 net/socket.c:2384
 __sys_sendmsg+0x105/0x1d0 net/socket.c:2417
 __do_sys_sendmsg net/socket.c:2426 [inline]
 __se_sys_sendmsg net/socket.c:2424 [inline]
 __x64_sys_sendmsg+0x78/0xb0 net/socket.c:2424
 do_syscall_64+0xfa/0x790 arch/x86/entry/common.c:294
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x445d49
Code: e8 bc b7 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b 12 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f8019074db8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000006dac38 RCX: 0000000000445d49
RDX: 0000000000000000 RSI: 0000000020000180 RDI: 0000000000000003
RBP: 00000000006dac30 R08: 0000000000000004 R09: 0000000000000000
R10: 0000000000000008 R11: 0000000000000246 R12: 00000000006dac3c
R13: 00007ffea687f6bf R14: 00007f80190759c0 R15: 20c49ba5e353f7cf

Fixes: e198987e7d ("gtp: fix suspicious RCU usage")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Cc: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 12:42:49 -08:00
yu kuai
e102774588 net: 3com: 3c59x: remove set but not used variable 'mii_reg1'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/3com/3c59x.c: In function ‘vortex_up’:
drivers/net/ethernet/3com/3c59x.c:1551:9: warning: variable
‘mii_reg1’ set but not used [-Wunused-but-set-variable]

It is never used, and so can be removed.

Signed-off-by: yu kuai <yukuai3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 12:40:03 -08:00
Chen-Yu Tsai
f1239d8aa8 net: stmmac: dwmac-sun8i: Allow all RGMII modes
Allow all the RGMII modes to be used. This would allow us to represent
the hardware better in the device tree with RGMII_ID where in most
cases the PHY's internal delay for both RX and TX are used.

Fixes: 9f93ac8d40 ("net-next: stmmac: Add dwmac-sun8i")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 12:31:25 -08:00
Chen-Yu Tsai
52cc73e540 net: stmmac: dwmac-sunxi: Allow all RGMII modes
Allow all the RGMII modes to be used. This would allow us to represent
the hardware better in the device tree with RGMII_ID where in most
cases the PHY's internal delay for both RX and TX are used.

Fixes: af0bd4e9ba ("net: stmmac: sunxi platform extensions for GMAC in Allwinner A20 SoC's")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 12:30:19 -08:00
David S. Miller
94d3997828 mlx5-updates-2020-01-07
This series adds 2 sets of changes to mlx5 driver
 1) Misc updates and cleanups:
 
 1.1) Stack usages warning cleanups and log level reduction
 1.2) Increase the max number of supported rings
 1.3) Support accept TC action on native NIC netdev.
 
 2) Software steering support for multi destination steering rules:
 First three patches from Erez are adding the low level FW command support
 and SW steering infrastructure to create the mult-destination FW tables.
 
 Last four patches from Alex are introducing the needed changes and APIs in
 SW steering to create and manage multi-destination actions and rules.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAl4U0UMACgkQSD+KveBX
 +j77OQgArFBG3REazF6P+ML3zv5jYKELlJ+pMNCTwubD5En3r8FpivharAuBYZNS
 HRn35opDRQbhUYKwigdvWWycWLh7dQt6OCD1/g8+LrzukwLX4odGIiXkR4uS+P4C
 lrFLa1QBjYdWwuCcyn0RZE8B/qxk/5b4Xh3KiBzYkbO7sSpniUg48S/FpTsowvre
 YxFhtjaCO9kQxaxWEkpM5SqcCuONLednWfLY0L2YLGLCZIWfTGNXamAbDgMQMciW
 jZFgYnCXvlkuD7EaZVowjMRdVjlULhxXxlRaaqRXNC1TAU8Q/FlWAljc2434DeUn
 Dr1vGL2n7IRZNANjOPxQL205zMTxqw==
 =PRNu
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2020-01-07' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2020-01-07

This series adds 2 sets of changes to mlx5 driver
1) Misc updates and cleanups:

1.1) Stack usages warning cleanups and log level reduction
1.2) Increase the max number of supported rings
1.3) Support accept TC action on native NIC netdev.

2) Software steering support for multi destination steering rules:
First three patches from Erez are adding the low level FW command support
and SW steering infrastructure to create the mult-destination FW tables.

Last four patches from Alex are introducing the needed changes and APIs in
SW steering to create and manage multi-destination actions and rules.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-08 12:25:42 -08:00
Eric Dumazet
47240ba0cd net: usb: lan78xx: fix possible skb leak
If skb_linearize() fails, we need to free the skb.

TSO makes skb bigger, and this bug might be the reason
Raspberry Pi 3B+ users had to disable TSO.

Fixes: 55d7de9de6 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: RENARD Pierre-Francois <pfrenard@gmail.com>
Cc: Stefan Wahren <stefan.wahren@i2se.com>
Cc: Woojung Huh <woojung.huh@microchip.com>
Cc: Microchip Linux Driver Support <UNGLinuxDriver@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-07 14:09:28 -08:00
YueHaibing
4addbcb387 enetc: Fix inconsistent IS_ERR and PTR_ERR
The proper pointer to be passed as argument is hw
Detected using Coccinelle.

Fixes: 6517798dd3 ("enetc: Make MDIO accessors more generic and export to include/linux/fsl")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-07 13:46:42 -08:00
Dan Carpenter
0d6e5bfc9c enetc: Fix an off by one in enetc_setup_tc_txtime()
The priv->tx_ring[] has 16 elements but only priv->num_tx_rings are
set up, the rest are NULL.  This ">" comparison should be ">=" to avoid
a potential crash.

Fixes: 0d08c9ec7d ("enetc: add support time specific departure base on the qos etf")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-07 13:46:20 -08:00
Jose Abreu
da29f2d84b net: stmmac: Fixed link does not need MDIO Bus
When using fixed link we don't need the MDIO bus support.

Reported-by: Heiko Stuebner <heiko@sntech.de>
Reported-by: kernelci.org bot <bot@kernelci.org>
Fixes: d3e014ec7d ("net: stmmac: platform: Fix MDIO init for platforms without PHY")
Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com>
Acked-by: Sriram Dash <Sriram.dash@samsung.com>
Tested-by: Patrice Chotard <patrice.chotard@st.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Acked-by: Neil Armstrong <narmstrong@baylibre.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Florian Fainelli <f.fainelli@gmail> # Lamobo R1 (fixed-link + MDIO sub node for roboswitch).
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-07 13:40:29 -08:00
Chen Zhou
c68d724826 drivers: net: cisco_hdlc: use __func__ in debug message
Use __func__ to print the function name instead of hard coded string.
BTW, replace printk(KERN_DEBUG, ...) with netdev_dbg.

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-07 13:33:35 -08:00
Chen Zhou
195234b885 net: ch9200: remove unnecessary return
The return is not needed, remove it.

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-07 13:30:36 -08:00
Chen Zhou
e64dec834e net: ch9200: use __func__ in debug message
Use __func__ to print the function name instead of hard coded string.

Signed-off-by: Chen Zhou <chenzhou10@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-07 13:30:36 -08:00
Jiping Ma
481a7d154c stmmac: debugfs entry name is not be changed when udev rename device name.
Add one notifier for udev changes net device name.
Fixes: b6601323ef9e ("net: stmmac: debugfs entry name is not be changed when udev rename")

Signed-off-by: Jiping Ma <jiping.ma2@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-07 13:26:16 -08:00
Shannon Nelson
6be1a5ce1b ionic: clear compiler warning on hb use before set
Build checks have pointed out that 'hb' can theoretically
be used before set, so let's initialize it and get rid
of the compiler complaint.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-07 13:05:06 -08:00
Shannon Nelson
c37d6e3f25 ionic: restrict received packets to mtu size
Make sure the NIC drops packets that are larger than the
specified MTU.

The front end of the NIC will accept packets larger than MTU and
will copy all the data it can to fill up the driver's posted
buffers - if the buffers are not long enough the packet will
then get dropped.  With the Rx SG buffers allocagted as full
pages, we are currently setting up more space than MTU size
available and end up receiving some packets that are larger
than MTU, up to the size of buffers posted.  To be sure the
NIC doesn't waste our time with oversized packets we need to
lie a little in the SG descriptor about how long is the last
SG element.

At dealloc time, we know the allocation was a page, so the
deallocation doesn't care about what length we put in the
descriptor.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-07 13:05:06 -08:00
Shannon Nelson
24cfa8c762 ionic: add Rx dropped packet counter
Add a counter for packets dropped by the driver, typically
for bad size or a receive error seen by the device.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-07 13:05:06 -08:00
Shannon Nelson
3daca28f15 ionic: drop use of subdevice tags
The subdevice concept is not being used in the driver, so
drop the references to it.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-07 13:05:06 -08:00
Alex Vesker
7ee3f6d248 net/mlx5: DR, Create multiple destination action from dr_create_fte
Until now it was possible to pass a packet to a single destination such
as vport or flow table. With the new support if multiple vports or multiple
tables are provided as destinations, fs_dr will create a multiple
destination table action, this action should replace other destination
actions provided to mlx5dr_create_rule.
Each vport destination can be provided with a reformat actions which
will be done before forwarding the packet to the vport.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-07 10:43:02 -08:00
Alex Vesker
b8853c969f net/mlx5: DR, Add support for multiple destination table action
A multiple destination table action allows HW packet duplication
to multiple destinations, this is useful for multicast or mirroring
traffic for debug. Duplicating is done using a FW flow table with
multiple destinations.

The new action creation function, mlx5dr_action_create_mult_dest_tbl
will allow creating a single table to iterate over several dr actions.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-07 10:42:49 -08:00
Alex Vesker
aec292ee6f net/mlx5: DR, Align dest FT action creation to API
Function prefix was changed to be similar to other action APIs.
In order to support other FW tables the mlx5_flow_table struct was
replaced with table id and type.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Reviewed-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-07 10:42:36 -08:00
Erez Shitrit
988fd6b32d net/mlx5: DR, Pass table flags at creation to lower layer
We need to have the flow-table flags when creation sw-steering tables,
this parameter exists in the layer between fs_core to sw_steering, this
patch gives it to the creation function.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-07 10:42:23 -08:00
Erez Shitrit
34583beea4 net/mlx5: DR, Create multi-destination table for SW-steering use
Currently SW steering doesn't have the means to access HW iterators to
support multi-destination (FTEs) flow table entries.

In order to support multi-destination FTEs for port-mirroring, SW
steering will create a dedicated multi-destination FW managed flow table
and FTEs via direct FW commands that we introduced in the previous patch.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-07 10:42:10 -08:00
Erez Shitrit
6de03d2dcb net/mlx5: DR, Create FTE entry in the FW from SW-steering
Implement the FW command to setup a FTE (Flow Table Entry) into the FW
managed flow tables.

Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-07 10:41:57 -08:00
Alex Vesker
cc78dbd768 net/mlx5: DR, Use attributes struct for FW flow table creation
Instead of using multiple variables use a simple struct. The
number of passed argument was too high after adding the encap
decap support bits arguments used for multiple destination reformat.

Signed-off-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-07 10:41:44 -08:00
Parav Pandit
3ed879965c net/mlx5: Use async EQ setup cleanup helpers for multiple EQs
Use helper routines to setup and teardown multiple EQs and reuse the
code in setup, cleanup and error unwinding flows.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-07 10:41:30 -08:00
Parav Pandit
7396ae3d1c net/mlx5: Reduce No CQ found log level from warn to debug
In below sequence, a EQE entry arrives for a CQ which is on the path of
being destroyed.

           cpu-0               cpu-1
           ------              -----
mlx5_core_destroy_cq()      mlx5_eq_comp_int()
  mlx5_eq_del_cq()          [..]
    radix_tree_delete()     [..]
  [..]                         mlx5_eq_cq_get() /* Didn't find CQ is
                                                 * a valid case.
                                                 */
  /* destroy CQ in hw */
  mlx5_cmd_exec()

This is still a valid scenario and correct delete CQ sequence, as
mirror of the CQ create sequence.
Hence, suppress the non harmful debug message from warn to debug level.
Keep the debug log message rate limited because user application can
trigger it repeatedly.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-07 10:41:17 -08:00
Fan Li
57c7fce14b net/mlx5: Increase the max number of channels to 128
Currently the max number of channels is limited to 64, which is half of
the indirection table size to allow some flexibility. But on servers
with more than 64 cores, users may want to utilize more queues.

This patch increases the advertised max number of channels to 128 by
changing the ratio between channels and indirection table slots to 1:1.
At the same time, the driver still enable no more than 64 channels at
loading. Users can change it by ethtool afterwards.

Signed-off-by: Fan Li <fanl@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-07 10:41:03 -08:00
Tonghao Zhang
15fc92ec3a net/mlx5e: Support accept action on nic table
In one case, we may forward packets from one vport
to others, but only one packets flow will be accepted,
which destination ip was assign to VF.

+-----+     +-----+            +-----+
| VFn |     | VF1 |            | VF0 | accept
+--+--+     +--+--+  hairpin   +--^--+
   |           | <--------------- |
   |           |                  |
+--+-----------v-+             +--+-------------+
|   eswitch PF1  |             |   eswitch PF0  |
+----------------+             +----------------+

tc filter add dev $PF0 protocol all parent ffff: prio 1 handle 1 \
	flower skip_sw action mirred egress redirect dev $VF0_REP
tc filter add dev $VF0 protocol ip  parent ffff: prio 1 handle 1 \
	flower skip_sw dst_ip $VF0_IP action pass
tc filter add dev $VF0 protocol all parent ffff: prio 2 handle 2 \
	flower skip_sw action mirred egress redirect dev $VF1

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Reviewed-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-07 10:40:50 -08:00
Arnd Bergmann
42ae1a5c76 mlx5: work around high stack usage with gcc
In some configurations, gcc tries too hard to optimize this code:

drivers/net/ethernet/mellanox/mlx5/core/en_stats.c: In function 'mlx5e_grp_sw_update_stats':
drivers/net/ethernet/mellanox/mlx5/core/en_stats.c:302:1: error: the frame size of 1336 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]

As was stated in the bug report, the reason is that gcc runs into a corner
case in the register allocator that is rather hard to fix in a good way.

As there is an easy way to work around it, just add a comment and the
barrier that stops gcc from trying to overoptimize the function.

Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92657
Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-07 10:40:36 -08:00
Zhu Yanjun
8007880a2c net/mlx5: limit the function in local scope
The function mlx5_buf_alloc_node is only used by the function in the
local scope. So it is appropriate to limit this function in the local
scope.

Signed-off-by: Zhu Yanjun <zyjzyj2000@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-07 10:40:22 -08:00
David S. Miller
5528e0d7f1 Merge branch '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue
Jeff Kirsher says:

====================
1GbE Intel Wired LAN Driver Updates 2020-01-06

This series contains updates to igc to add basic support for
timestamping.

Vinicius adds basic support for timestamping and enables ptp4l/phc2sys
to work with i225 devices.  Initially, adds the ability to read and
adjust the PHC clock.  Patches 2 & 3 enable and retrieve hardware
timestamps.  Patch 4 implements the ethtool ioctl that ptp4l uses to
check what timestamping methods are supported.  Lastly, added support to
do timestamping using the "Start of Packet" signal from the PHY, which
is now supported in i225 devices.

While i225 does support multiple PTP domains, with multiple timestamping
registers, we currently only support one PTP domain and use only one of
the timestamping registers for implementation purposes.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 18:44:48 -08:00
Andrew Lunn
8ddf0b5693 net: dsa: mv88e6xxx: Unique ATU and VTU IRQ names
Dynamically generate a unique interrupt name for the VTU and ATU,
based on the device name.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 18:30:15 -08:00
Andrew Lunn
06acd1148b net: dsa: mv88e6xxx: Unique g2 IRQ name
Dynamically generate a unique g2 interrupt name, based on the
device name.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 18:30:14 -08:00
Andrew Lunn
8b4db28914 net: dsa: mv88e6xxx: Unique watchdog IRQ name
Dynamically generate a unique watchdog interrupt name, based on the
device name.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 18:30:14 -08:00
Andrew Lunn
e6f2f6b824 net: dsa: mv88e6xxx: Unique SERDES interrupt names
Dynamically generate a unique SERDES interrupt name, based on the
device name and the port the SERDES is for. For example:

 95:          3  mv88e6xxx-g2   9 Edge      mv88e6xxx-0.2:00-serdes-9
 96:          0  mv88e6xxx-g2  10 Edge      mv88e6xxx-0.2:00-serdes-10

The 0.2:00 indicates the switch and -9 indicates port 9.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 18:30:14 -08:00
Andrew Lunn
3095383a8a net: dsa: mv88e6xxx: Unique IRQ name
Dynamically generate a unique switch interrupt name, based on the
device name.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 18:30:14 -08:00
Erez Shitrit
df55c5586e net/mlx5: DR, Init lists that are used in rule's member
Whenever adding new member of rule object we attach it to 2 lists,
These 2 lists should be initialized first.

Fixes: 41d0707415 ("net/mlx5: DR, Expose steering rule functionality")
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-06 15:30:05 -08:00
Eli Cohen
6412bb396a net/mlx5e: Fix hairpin RSS table size
Set hairpin table size to the corret size, based on the groups that
would be created in it. Groups are laid out on the table such that a
group occupies a range of entries in the table. This implies that the
group ranges should have correspondence to the table they are laid upon.

The patch cited below  made group 1's size to grow hence causing
overflow of group range laid on the table.

Fixes: a795d8db2a ("net/mlx5e: Support RSS for IP-in-IP and IPv6 tunneled packets")
Signed-off-by: Eli Cohen <eli@mellanox.com>
Signed-off-by: Mark Bloch <markb@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-06 15:30:05 -08:00
Yevgeny Kliteynik
4ce380ca47 net/mlx5: DR, No need for atomic refcount for internal SW steering resources
No need for an atomic refcounter for the STE and hashtables.
These are internal SW steering resources and they are always
under domain mutex.

This also fixes the following refcount error:
  refcount_t: addition on 0; use-after-free.
  WARNING: CPU: 9 PID: 3527 at lib/refcount.c:25 refcount_warn_saturate+0x81/0xe0
  Call Trace:
   dr_table_init_nic+0x10d/0x110 [mlx5_core]
   mlx5dr_table_create+0xb4/0x230 [mlx5_core]
   mlx5_cmd_dr_create_flow_table+0x39/0x120 [mlx5_core]
   __mlx5_create_flow_table+0x221/0x5f0 [mlx5_core]
   esw_create_offloads_fdb_tables+0x180/0x5a0 [mlx5_core]
   ...

Fixes: 26d688e33f ("net/mlx5: DR, Add Steering entry (STE) utilities")
Signed-off-by: Yevgeny Kliteynik <kliteyn@mellanox.com>
Reviewed-by: Alex Vesker <valex@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-06 15:30:04 -08:00
Parav Pandit
1f0593e791 Revert "net/mlx5: Support lockless FTE read lookups"
This reverts commit 7dee607ed0.

During cleanup path, FTE's parent node group is removed which is
referenced by the FTE while freeing the FTE.
Hence FTE's lockless read lookup optimization done in cited commit is
not possible at the moment.

Hence, revert the commit.

This avoid below KAZAN call trace.

[  110.390896] BUG: KASAN: use-after-free in find_root.isra.14+0x56/0x60
[mlx5_core]
[  110.391048] Read of size 4 at addr ffff888c19e6d220 by task
swapper/12/0

[  110.391219] CPU: 12 PID: 0 Comm: swapper/12 Not tainted 5.5.0-rc1+
[  110.391222] Hardware name: HP ProLiant DL380p Gen8, BIOS P70
08/02/2014
[  110.391225] Call Trace:
[  110.391229]  <IRQ>
[  110.391246]  dump_stack+0x95/0xd5
[  110.391307]  ? find_root.isra.14+0x56/0x60 [mlx5_core]
[  110.391320]  print_address_description.constprop.5+0x20/0x320
[  110.391379]  ? find_root.isra.14+0x56/0x60 [mlx5_core]
[  110.391435]  ? find_root.isra.14+0x56/0x60 [mlx5_core]
[  110.391441]  __kasan_report+0x149/0x18c
[  110.391499]  ? find_root.isra.14+0x56/0x60 [mlx5_core]
[  110.391504]  kasan_report+0x12/0x20
[  110.391511]  __asan_report_load4_noabort+0x14/0x20
[  110.391567]  find_root.isra.14+0x56/0x60 [mlx5_core]
[  110.391625]  del_sw_fte_rcu+0x4a/0x100 [mlx5_core]
[  110.391633]  rcu_core+0x404/0x1950
[  110.391640]  ? rcu_accelerate_cbs_unlocked+0x100/0x100
[  110.391649]  ? run_rebalance_domains+0x201/0x280
[  110.391654]  rcu_core_si+0xe/0x10
[  110.391661]  __do_softirq+0x181/0x66c
[  110.391670]  irq_exit+0x12c/0x150
[  110.391675]  smp_apic_timer_interrupt+0xf0/0x370
[  110.391681]  apic_timer_interrupt+0xf/0x20
[  110.391684]  </IRQ>
[  110.391695] RIP: 0010:cpuidle_enter_state+0xfa/0xba0
[  110.391703] Code: 3d c3 9b b5 50 e8 56 75 6e fe 48 89 45 c8 0f 1f 44
00 00 31 ff e8 a6 94 6e fe 45 84 ff 0f 85 f6 02 00 00 fb 66 0f 1f 44 00
00 <45> 85 f6 0f 88 db 06 00 00 4d 63 fe 4b 8d 04 7f 49 8d 04 87 49 8d
[  110.391706] RSP: 0018:ffff888c23a6fce8 EFLAGS: 00000246 ORIG_RAX:
ffffffffffffff13
[  110.391712] RAX: dffffc0000000000 RBX: ffffe8ffff7002f8 RCX:
000000000000001f
[  110.391715] RDX: 1ffff11184ee6cb5 RSI: 0000000040277d83 RDI:
ffff888c277365a8
[  110.391718] RBP: ffff888c23a6fd40 R08: 0000000000000002 R09:
0000000000035280
[  110.391721] R10: ffff888c23a6fc80 R11: ffffed11847485d0 R12:
ffffffffb1017740
[  110.391723] R13: 0000000000000003 R14: 0000000000000003 R15:
0000000000000000
[  110.391732]  ? cpuidle_enter_state+0xea/0xba0
[  110.391738]  cpuidle_enter+0x4f/0xa0
[  110.391747]  call_cpuidle+0x6d/0xc0
[  110.391752]  do_idle+0x360/0x430
[  110.391758]  ? arch_cpu_idle_exit+0x40/0x40
[  110.391765]  ? complete+0x67/0x80
[  110.391771]  cpu_startup_entry+0x1d/0x20
[  110.391779]  start_secondary+0x2f3/0x3c0
[  110.391784]  ? set_cpu_sibling_map+0x2500/0x2500
[  110.391795]  secondary_startup_64+0xa4/0xb0

[  110.391841] Allocated by task 290:
[  110.391917]  save_stack+0x21/0x90
[  110.391921]  __kasan_kmalloc.constprop.8+0xa7/0xd0
[  110.391925]  kasan_kmalloc+0x9/0x10
[  110.391929]  kmem_cache_alloc_trace+0xf6/0x270
[  110.391987]  create_root_ns.isra.36+0x58/0x260 [mlx5_core]
[  110.392044]  mlx5_init_fs+0x5fd/0x1ee0 [mlx5_core]
[  110.392092]  mlx5_load_one+0xc7a/0x3860 [mlx5_core]
[  110.392139]  init_one+0x6ff/0xf90 [mlx5_core]
[  110.392145]  local_pci_probe+0xde/0x190
[  110.392150]  work_for_cpu_fn+0x56/0xa0
[  110.392153]  process_one_work+0x678/0x1140
[  110.392157]  worker_thread+0x573/0xba0
[  110.392162]  kthread+0x341/0x400
[  110.392166]  ret_from_fork+0x1f/0x40

[  110.392218] Freed by task 2742:
[  110.392288]  save_stack+0x21/0x90
[  110.392292]  __kasan_slab_free+0x137/0x190
[  110.392296]  kasan_slab_free+0xe/0x10
[  110.392299]  kfree+0x94/0x250
[  110.392357]  tree_put_node+0x257/0x360 [mlx5_core]
[  110.392413]  tree_remove_node+0x63/0xb0 [mlx5_core]
[  110.392469]  clean_tree+0x199/0x240 [mlx5_core]
[  110.392525]  mlx5_cleanup_fs+0x76/0x580 [mlx5_core]
[  110.392572]  mlx5_unload+0x22/0xc0 [mlx5_core]
[  110.392619]  mlx5_unload_one+0x99/0x260 [mlx5_core]
[  110.392666]  remove_one+0x61/0x160 [mlx5_core]
[  110.392671]  pci_device_remove+0x10b/0x2c0
[  110.392677]  device_release_driver_internal+0x1e4/0x490
[  110.392681]  device_driver_detach+0x36/0x40
[  110.392685]  unbind_store+0x147/0x200
[  110.392688]  drv_attr_store+0x6f/0xb0
[  110.392693]  sysfs_kf_write+0x127/0x1d0
[  110.392697]  kernfs_fop_write+0x296/0x420
[  110.392702]  __vfs_write+0x66/0x110
[  110.392707]  vfs_write+0x1a0/0x500
[  110.392711]  ksys_write+0x164/0x250
[  110.392715]  __x64_sys_write+0x73/0xb0
[  110.392720]  do_syscall_64+0x9f/0x3a0
[  110.392725]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 7dee607ed0 ("net/mlx5: Support lockless FTE read lookups")
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-06 15:30:04 -08:00
Michael Guralnik
a6f3b62386 net/mlx5: Move devlink registration before interfaces load
Register devlink before interfaces are added.
This will allow interfaces to use devlink while initalizing. For example,
call mlx5_is_roce_enabled.

Fixes: aba25279c1 ("net/mlx5e: Add TX reporter support")
Signed-off-by: Michael Guralnik <michaelgur@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-06 15:30:04 -08:00
Eran Ben Elisha
99cda45426 net/mlx5e: Always print health reporter message to dmesg
In case a reporter exists, error message is logged only to the devlink
tracer. The devlink tracer is a visibility utility only, which user can
choose not to monitor.
After cited patch, 3rd party monitoring tools that tracks these error
message will no longer find them in dmesg, causing a regression.

With this patch, error messages are also logged into the dmesg.

Fixes: c50de4af1d ("net/mlx5e: Generalize tx reporter's functionality")
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-06 15:30:04 -08:00
Dmytro Linkin
554fe75c1b net/mlx5e: Avoid duplicating rule destinations
Following scenario easily break driver logic and crash the kernel:
1. Add rule with mirred actions to same device.
2. Delete this rule.
In described scenario rule is not added to database and on deletion
driver access invalid entry.
Example:

 $ tc filter add dev ens1f0_0 ingress protocol ip prio 1 \
       flower skip_sw \
       action mirred egress mirror dev ens1f0_1 pipe \
       action mirred egress redirect dev ens1f0_1
 $ tc filter del dev ens1f0_0 ingress protocol ip prio 1

Dmesg output:

[  376.634396] mlx5_core 0000:82:00.0: mlx5_cmd_check:756:(pid 3439): DESTROY_FLOW_GROUP(0x934) op_mod(0x0) failed, status bad resource state(0x9), syndrome (0x563e2f)
[  376.654983] mlx5_core 0000:82:00.0: del_hw_flow_group:567:(pid 3439): flow steering can't destroy fg 89 of ft 3145728
[  376.673433] kasan: CONFIG_KASAN_INLINE enabled
[  376.683769] kasan: GPF could be caused by NULL-ptr deref or user memory access
[  376.695229] general protection fault: 0000 [#1] PREEMPT SMP KASAN PTI
[  376.705069] CPU: 7 PID: 3439 Comm: tc Not tainted 5.4.0-rc5+ #76
[  376.714959] Hardware name: Supermicro SYS-2028TP-DECTR/X10DRT-PT, BIOS 2.0a 08/12/2016
[  376.726371] RIP: 0010:mlx5_del_flow_rules+0x105/0x960 [mlx5_core]
[  376.735817] Code: 01 00 00 00 48 83 eb 08 e8 28 d9 ff ff 4c 39 e3 75 d8 4c 8d bd c0 02 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 84 04 00 00 48 8d 7d 28 8b 9 d
[  376.761261] RSP: 0018:ffff888847c56db8 EFLAGS: 00010202
[  376.770054] RAX: dffffc0000000000 RBX: ffff8888582a6da0 RCX: ffff888847c56d60
[  376.780743] RDX: 0000000000000058 RSI: 0000000000000008 RDI: 0000000000000282
[  376.791328] RBP: 0000000000000000 R08: fffffbfff0c60ea6 R09: fffffbfff0c60ea6
[  376.802050] R10: fffffbfff0c60ea5 R11: ffffffff8630752f R12: ffff8888582a6da0
[  376.812798] R13: dffffc0000000000 R14: ffff8888582a6da0 R15: 00000000000002c0
[  376.823445] FS:  00007f675f9a8840(0000) GS:ffff88886d200000(0000) knlGS:0000000000000000
[  376.834971] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  376.844179] CR2: 00000000007d9640 CR3: 00000007d3f26003 CR4: 00000000001606e0
[  376.854843] Call Trace:
[  376.868542]  __mlx5_eswitch_del_rule+0x49/0x300 [mlx5_core]
[  376.877735]  mlx5e_tc_del_fdb_flow+0x6ec/0x9e0 [mlx5_core]
[  376.921549]  mlx5e_flow_put+0x2b/0x50 [mlx5_core]
[  376.929813]  mlx5e_delete_flower+0x5b6/0xbd0 [mlx5_core]
[  376.973030]  tc_setup_cb_reoffload+0x29/0xc0
[  376.980619]  fl_reoffload+0x50a/0x770 [cls_flower]
[  377.015087]  tcf_block_playback_offloads+0xbd/0x250
[  377.033400]  tcf_block_setup+0x1b2/0xc60
[  377.057247]  tcf_block_offload_cmd+0x195/0x240
[  377.098826]  tcf_block_offload_unbind+0xe7/0x180
[  377.107056]  __tcf_block_put+0xe5/0x400
[  377.114528]  ingress_destroy+0x3d/0x60 [sch_ingress]
[  377.122894]  qdisc_destroy+0xf1/0x5a0
[  377.129993]  qdisc_graft+0xa3d/0xe50
[  377.151227]  tc_get_qdisc+0x48e/0xa20
[  377.165167]  rtnetlink_rcv_msg+0x35d/0x8d0
[  377.199528]  netlink_rcv_skb+0x11e/0x340
[  377.219638]  netlink_unicast+0x408/0x5b0
[  377.239913]  netlink_sendmsg+0x71b/0xb30
[  377.267505]  sock_sendmsg+0xb1/0xf0
[  377.273801]  ___sys_sendmsg+0x635/0x900
[  377.312784]  __sys_sendmsg+0xd3/0x170
[  377.338693]  do_syscall_64+0x95/0x460
[  377.344833]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  377.352321] RIP: 0033:0x7f675e58e090

To avoid this, for every mirred action check if output device was
already processed. If so - drop rule with EOPNOTSUPP error.

Signed-off-by: Dmytro Linkin <dmitrolin@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-01-06 15:30:03 -08:00
Vinicius Costa Gomes
a299df3524 igc: Use Start of Packet signal from PHY for timestamping
For better accuracy, i225 is able to do timestamping using the Start of
Packet signal from the PHY.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-01-06 15:02:45 -08:00
Vinicius Costa Gomes
60dbede0c4 igc: Add support for ethtool GET_TS_INFO command
This command allows igc to report what types of timestamping are
supported. ptp4l uses this to detect if the hardware supports
timestamping.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-01-06 14:59:48 -08:00
Vinicius Costa Gomes
2c344ae245 igc: Add support for TX timestamping
This adds support for timestamping packets being transmitted.

Based on the code from i210. The basic differences is that i225 has 4
registers to store the transmit timestamps (i210 has one). Right now,
we only support retrieving from one register, support for using the
other registers will be added later.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-01-06 14:39:17 -08:00
Vinicius Costa Gomes
81b055205e igc: Add support for RX timestamping
This adds support for timestamping received packets.

It is based on the i210, as many features of i225 work the same way.
The main difference from i210 is that i225 has support for choosing
the timer register to use when timestamping packets. Right now, we
only support using timer 0. The other difference is that i225 stores
two timestamps in the receive descriptor, right now, we only retrieve
one.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-01-06 14:19:31 -08:00
Igor Russkikh
b585f8602a net: atlantic: remove duplicate entries
Function entries were duplicated accidentally, removing the dups.

Fixes: ea4b4d7fc1 ("net: atlantic: loopback tests via private flags")
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 14:06:11 -08:00
Igor Russkikh
883daa1854 net: atlantic: loopback configuration in improper place
Initial loopback configuration should be called earlier, before
starting traffic on HW blocks. Otherwise depending on race conditions
it could be kept disabled.

Fixes: ea4b4d7fc1 ("net: atlantic: loopback tests via private flags")
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 14:06:11 -08:00
Igor Russkikh
ac70957ee1 net: atlantic: broken link status on old fw
Last code/checkpatch cleanup did a copy paste error where code from
firmware 3 API logic was moved to firmware 1 logic.

This resulted in FW1.x users would never see the link state as active.

Fixes: 7b0c342f1f ("net: atlantic: code style cleanup")
Signed-off-by: Igor Russkikh <irusskikh@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 14:06:11 -08:00
Michal Kubecek
4ac0ac847f epic100: allow nesting of ethtool_ops begin() and complete()
Unlike most networking drivers using begin() and complete() ethtool_ops
callbacks to resume a device which is down and suspend it again when done,
epic100 does not use standard refcounted infrastructure but sets device
sleep state directly.

With the introduction of netlink ethtool interface, we may have nested
begin-complete blocks so that inner complete() would put the device back to
sleep for the rest of the outer block.

To avoid rewriting an old and not very actively developed driver, just add
a nesting counter and only perform resume and suspend on the outermost
level.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 13:54:55 -08:00
Michal Kubecek
71f711a4f1 via-velocity: allow nesting of ethtool_ops begin() and complete()
Unlike most networking drivers using begin() and complete() ethtool_ops
callbacks to resume a device which is down and suspend it again when done,
via-velocity does not use standard refcounted infrastructure but sets
device sleep state directly.

With the introduction of netlink ethtool interface, we may have nested
begin-complete blocks so that inner complete() would put the device back to
sleep for the rest of the outer block.

To avoid rewriting an old and not very actively developed driver, just add
a nesting counter and only perform resume and suspend on the outermost
level.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 13:54:55 -08:00
Michal Kubecek
a69faa0910 wil6210: get rid of begin() and complete() ethtool_ops
The wil6210 driver locks a mutex in begin() ethtool_ops callback and
unlocks it in complete() so that all ethtool requests are serialized. This
is not going to work correctly with netlink interface; e.g. when ioctl
triggers a netlink notification, netlink code would call begin() again
while the mutex taken by ioctl code is still held by the same task.

Let's get rid of the begin() and complete() callbacks and move the mutex
locking into the remaining ethtool_ops handlers except get_drvinfo which
only copies strings that are not changing so that there is no need for
serialization.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 13:54:55 -08:00
Christophe JAILLET
b289ba5e07 gtp: simplify error handling code in 'gtp_encap_enable()'
'gtp_encap_disable_sock(sk)' handles the case where sk is NULL, so there
is no need to test it before calling the function.

This saves a few line of code.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 13:39:35 -08:00
Amit Cohen
ca360db4b8 mlxsw: spectrum: Disable DIP_LINK_LOCAL check in hardware pipeline
The check drops packets if they need to be routed and their destination
IP is link-local, i.e., belongs to 169.254.0.0/16 address range.

Disable the check since the kernel forwards such packets and does not
drop them.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 13:38:37 -08:00
Amit Cohen
e317b0f77e mlxsw: spectrum: Disable SIP_DIP check in hardware pipeline
The check drops packets if they need to be routed and their source IP
equals to their destination IP.

Disable the check since the kernel forwards such packets and does not
drop them.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 13:38:36 -08:00
Amit Cohen
359ec56679 mlxsw: spectrum: Disable MC_DMAC check in hardware pipeline
The check drops packets if they need to be routed and their multicast
MAC mismatched to their multicast destination IP.

For IPV4:
DMAC is mismatched if it is different from {01-00-5E-0 (25 bits),
DIP[22:0]}

For IPV6:
DMAC is mismatched if it is different from {33-33-0 (16 bits),
DIP[31:0]}

Disable the check since the kernel forwards such packets and does not
drop them.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 13:38:36 -08:00
Amit Cohen
62b0fb099c mlxsw: spectrum: Disable SIP_CLASS_E check in hardware pipeline
The check drops packets if they need to be routed and their source IP is
from class E, i.e., belongs to 240.0.0.0/4 address range, but different
from 255.255.255.255.

Disable the check since the kernel forwards such packets and does not
drop them.

Signed-off-by: Amit Cohen <amitc@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 13:38:36 -08:00
Andrew Lunn
d8dc2c9676 net: dsa: mv88e6xxx: Preserve priority when setting CPU port.
The 6390 family uses an extended register to set the port connected to
the CPU. The lower 5 bits indicate the port, the upper three bits are
the priority of the frames as they pass through the switch, what
egress queue they should use, etc. Since frames being set to the CPU
are typically management frames, BPDU, IGMP, ARP, etc set the priority
to 7, the reset default, and the highest.

Fixes: 33641994a6 ("net: dsa: mv88e6xxx: Monitor and Management tables")
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Chris Healy <cphealy@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 13:35:11 -08:00
Krzysztof Kozlowski
5adcb8b186 net: ethernet: sxgbe: Rename Samsung to lowercase
Fix up inconsistent usage of upper and lowercase letters in "Samsung"
name.

"SAMSUNG" is not an abbreviation but a regular trademarked name.
Therefore it should be written with lowercase letters starting with
capital letter.

Although advertisement materials usually use uppercase "SAMSUNG", the
lowercase version is used in all legal aspects (e.g. on Wikipedia and in
privacy/legal statements on
https://www.samsung.com/semiconductor/privacy-global/).

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 13:33:14 -08:00
Vinicius Costa Gomes
5f2958052c igc: Add basic skeleton for PTP
This allows the creation of the /dev/ptpX device for i225, and reading
and writing the time.

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2020-01-06 13:33:01 -08:00
Krzysztof Kozlowski
00c0688cec net: wan: sdla: Fix cast from pointer to integer of different size
Since net_device.mem_start is unsigned long, it should not be cast to
int right before casting to pointer.  This fixes warning (compile
testing on alpha architecture):

    drivers/net/wan/sdla.c: In function ‘sdla_transmit’:
    drivers/net/wan/sdla.c:711:13: warning:
        cast to pointer from integer of different size [-Wint-to-pointer-cast]

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 13:30:03 -08:00
Huazhong Tan
7f39febf2e net: hns3: modify an unsuitable reset level for hardware error
According to hardware user manual, when hardware reports error
'roc_pkt_without_key_port', the driver should assert function
reset to do the recovery.

So this patch uses HNAE3_FUNC_RESET to replace HNAE3_GLOBAL_RESET.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 13:26:25 -08:00
Huazhong Tan
7061867b59 net: hns3: replace an unsuitable variable type in hclge_inform_reset_assert_to_vf()
In hclge_inform_reset_assert_to_vf(), variable reset_type(enum type)
will be copied into msg_data whose size is 2 bytes. Currently, hip08
is a little-endian machine, so the lower two bytes of reset_type will
be copied to msg_data. But when running on a big-endian machine,
msg_data will have a wrong value(the higher two bytes of reset_type).

So this patch modifies the type of reset_type to u16, and adds a
build check in case enum hnae3_reset_type has value larger than
U16_MAX.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 13:26:25 -08:00
Guojia Liao
2af8cb6126 net: hns3: add protection when get SFP speed as 0
In some case, the MAC speed get from hardware maybe 0, it should
not be set to mac->speed.

Signed-off-by: Guojia Liao <liaoguojia@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 13:26:25 -08:00
Yonglong Liu
f97c4d823f net: hns3: modify the IRQ name of misc vectors
The misc IRQ of all the devices have the same name, so it's
hard to find the right misc IRQ of the device.

This patch modifies the misc IRQ names as "hclge/hclgevf"-misc-
"pci name". And now the IRQ name is not related to net device
name anymore, so change the HNAE3_INT_NAME_LEN to 32 bytes, and
that is enough.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 13:26:25 -08:00
Yonglong Liu
7ab2b53e46 net: hns3: modify an unsuitable log in hclge_map_ring_to_vector()
When the returned vector_id less than 0, the message should print
out the vector who is getting vector index fail.

So this patch replaces vector_id with vector, and re-format the
message.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 13:26:25 -08:00
Yonglong Liu
5bffde62a1 net: hns3: modify the IRQ name of TQP vector
When rename the net devices, the IRQ number can not be
fetched by the net device name, because the driver request
the IRQ resources only when the vector resource changed, and
the rename operation did not change the vector resources,
so the IRQ name keeps the previous net device name.
So this patch modifies the name of the TQP IRQ as
"pci driver name"-"pci name"-"TxRx"-"index".

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 13:26:25 -08:00
Yonglong Liu
08a100689d net: hns3: re-organize vector handle
To prevent loss user's IRQ affinity configuration when DOWN,
this patch moves out release/request operation of the vector
handle from net DOWN/UP, just do it when vector resource changes.

Signed-off-by: Yonglong Liu <liuyonglong@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 13:26:25 -08:00
Yunsheng Lin
698a89541c net: hns3: add trace event support for HNS3 driver
This adds trace support for HNS3 driver. It also declares
some events which could be used to trace the events when a
TX/RX BD is processed, and other events which are related to
the processing of sk_buff, such as TSO, GRO.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-06 13:26:25 -08:00
Vladimir Oltean
bdeced75b1 net: dsa: felix: Add PCS operations for PHYLINK
Layerscape SoCs traditionally expose the SerDes configuration/status for
Ethernet protocols (PCS for SGMII/USXGMII/10GBase-R etc etc) in a register
format that is compatible with clause 22 or clause 45 (depending on
SerDes protocol). Each MAC has its own internal MDIO bus on which there
is one or more of these PCS's, responding to commands at a configurable
PHY address. The per-port internal MDIO bus (which is just for PCSs) is
totally separate and has nothing to do with the dedicated external MDIO
controller (which is just for PHYs), but the register map for the MDIO
controller is the same.

The VSC9959 (Felix) switch instantiated in the LS1028A is integrated
in hardware with the ENETC PCS of its DSA master, and reuses its MDIO
controller driver, so Felix has been made to depend on it in Kconfig.

 +------------------------------------------------------------------------+
 |                   +--------+ GMII (typically disabled via RCW)         |
 | ENETC PCI         |  ENETC |--------------------------+                |
 | Root Complex      | port 3 |-----------------------+  |                |
 | Integrated        +--------+                       |  |                |
 | Endpoint                                           |  |                |
 |                   +--------+ 2.5G GMII             |  |                |
 |                   |  ENETC |--------------+        |  |                |
 |                   | port 2 |-----------+  |        |  |                |
 |                   +--------+           |  |        |  |                |
 |                                     +--------+  +--------+             |
 |                                     |  Felix |  |  Felix |             |
 |                                     | port 4 |  | port 5 |             |
 |                                     +--------+  +--------+             |
 |                                                                        |
 | +--------+  +--------+  +--------+  +--------+  +--------+  +--------+ |
 | |  ENETC |  |  ENETC |  |  Felix |  |  Felix |  |  Felix |  |  Felix | |
 | | port 0 |  | port 1 |  | port 0 |  | port 1 |  | port 2 |  | port 3 | |
 +------------------------------------------------------------------------+
 |    ||||  SerDes |          ||||        ||||        ||||        ||||    |
 | +--------+block |       +--------------------------------------------+ |
 | |  ENETC |      |       |       ENETC port 2 internal MDIO bus       | |
 | | port 0 |      |       |  PCS         PCS          PCS        PCS   | |
 | |   PCS  |      |       |   0           1            2          3    | |
 +-----------------|------------------------------------------------------+
        v          v           v           v            v          v
     SGMII/      RGMII    QSGMII/QSXGMII/4xSGMII/4x1000Base-X/4x2500Base-X
    USXGMII/   (bypasses
  1000Base-X/   SerDes)
  2500Base-X

In the LS1028A SoC described above, the VSC9959 Felix switch is PF5 of
the ENETC root complex, and has 2 BARs:
- BAR 4: the switch's effective registers
- BAR 0: the MDIO controller register map lended from ENETC port 2
         (PF2), for accessing its associated PCS's.

This explanation is necessary because the patch does some renaming
"pci_bar" -> "switch_pci_bar" for clarity, which would otherwise appear
a bit obtuse.

The fact that the internal MDIO bus is "borrowed" is relevant because
the register map is found in PF5 (the switch) but it triggers an access
fault if PF2 (the ENETC DSA master) is not enabled. This is not treated
in any way (and I don't think it can be treated).

All of this is so SoC-specific, that it was contained as much as
possible in the platform-integration file felix_vsc9959.c.

We need to parse and pre-validate the device tree because of 2 reasons:
- The PHY mode (SerDes protocol) cannot change at runtime due to SoC
  design.
- There is a circular dependency in that we need to know what clause the
  PCS speaks in order to find it on the internal MDIO bus. But the
  clause of the PCS depends on what phy-mode it is configured for.

The goal of this patch is to make steps towards removing the bootloader
dependency for SGMII PCS pre-configuration, as well as to add support
for monitoring the in-band SGMII AN between the PCS and the system-side
link partner (PHY or other MAC).

In practice the bootloader dependency is not completely removed. U-Boot
pre-programs the PHY address at which each PCS can be found on the
internal MDIO bus (MDEV_PORT). This is needed because the PCS of each
port has the same out-of-reset PHY address of zero. The SerDes register
for changing MDEV_PORT is pretty deep in the SoC (outside the addresses
of the ENETC PCI BARs) and therefore inaccessible to us from here.

Felix VSC9959 and Ocelot VSC7514 are integrated very differently in
their respective SoCs, and for that reason Felix does not use the Ocelot
core library for PHYLINK. On one hand we don't want to impose the
fixed phy-mode limitation to Ocelot, and on the other hand Felix doesn't
need to force the MAC link speed the way Ocelot does, since the MAC is
connected to the PCS through a fixed GMII, and the PCS is the one who
does the rate adaptation at lower link speeds, which the MAC does not
even need to know about. In fact changing the GMII speed for Felix
irrecoverably breaks transmission through that port until a reset.

The pair with ENETC port 3 and Felix port 5 is optional and doesn't
support tagging. When we enable it, swp5 is a regular slave port, albeit
an internal one. The trouble is that it doesn't work, and that is
because the DSA PHYLIB adaptation layer doesn't treat fixed-link slave
ports. So that is yet another reason for wanting to convert Felix to the
native PHYLINK API.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-05 23:22:33 -08:00
Vladimir Oltean
964ee5c82b net: mscc: ocelot: export ANA, DEV and QSYS registers to include/soc/mscc
Since the Felix DSA driver is implementing its own PHYLINK instance due
to SoC differences, it needs access to the few registers that are
common, mainly for flow control.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-05 23:22:33 -08:00
Vladimir Oltean
ee50d07c9f net: mscc: ocelot: make phy_mode a member of the common struct ocelot_port
The Ocelot switchdev driver and the Felix DSA one need it for different
reasons. Felix (or at least the VSC9959 instantiation in NXP LS1028A) is
integrated with the traditional NXP Layerscape PCS design which does not
support runtime configuration of SerDes protocol. So it needs to
pre-validate the phy-mode from the device tree and prevent PHYLINK from
attempting to change it. For this, it needs to cache it in a private
variable.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-05 23:22:33 -08:00
Vladimir Oltean
d79d30327f enetc: Set MDIO_CFG_HOLD to the recommended value of 2
This increases the MDIO hold time to 5 enet_clk cycles from the previous
value of 0. This is actually the out-of-reset value, that the driver was
previously overwriting with 0. Zero worked for the external MDIO, but
breaks communication with the internal MDIO buses on which the PCS of
ENETC SI's and Felix switch are found.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-05 23:22:32 -08:00
Claudiu Manoil
6517798dd3 enetc: Make MDIO accessors more generic and export to include/linux/fsl
Within the LS1028A SoC, the register map for the ENETC MDIO controller
is instantiated a few times: for the central (external) MDIO controller,
for the internal bus of each standalone ENETC port, and for the internal
bus of the Felix switch.

Refactoring is needed to support multiple MDIO buses from multiple
drivers. The enetc_hw structure is made an opaque type and a smaller
enetc_mdio_priv is created.

'mdio_base' - MDIO registers base address - is being parameterized, to
be able to work with different MDIO register bases.

The ENETC MDIO bus operations are exported from the fsl-enetc-mdio
kernel object, the same that registers the central MDIO controller (the
dedicated PF). The ENETC main driver has been changed to select it, and
use its exported helpers to further register its private MDIO bus. The
DSA Felix driver will do the same.

Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-05 23:22:32 -08:00
Vladimir Oltean
1511ed0a01 net: phylink: add support for polling MAC PCS
Some MAC PCS blocks are unable to provide interrupts when their status
changes. As we already have support in phylink for polling status, use
this to provide a hook for MACs to enable polling mode.

The patch idea was picked up from Russell King's suggestion on the macb
phylink patch thread here [0] but the implementation was changed.
Instead of introducing a new phylink_start_poll() function, which would
make the implementation cumbersome for common PHYLINK implementations
for multiple types of devices, like DSA, just add a boolean property to
the phylink_config structure, which is just as backwards-compatible.

https://lkml.org/lkml/2019/12/16/603

Suggested-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-05 23:22:32 -08:00
Vladimir Oltean
3a68ba6fba net: phylink: make QSGMII a valid PHY mode for in-band AN
QSGMII is a SerDes protocol clocked at 5 Gbaud (4 times higher than
SGMII which is clocked at 1.25 Gbaud), with the same 8b/10b encoding and
some extra symbols for synchronization. Logically it offers 4 SGMII
interfaces multiplexed onto the same physical lanes. Each MAC PCS has
its own in-band AN process with the system side of the QSGMII PHY, which
is identical to the regular SGMII AN process.

So allow QSGMII as a valid in-band AN mode, since it is no different
from software perspective from regular SGMII.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-05 23:22:32 -08:00
Vladimir Oltean
a68578c20a net: dsa: Make deferred_xmit private to sja1105
There are 3 things that are wrong with the DSA deferred xmit mechanism:

1. Its introduction has made the DSA hotpath ever so slightly more
   inefficient for everybody, since DSA_SKB_CB(skb)->deferred_xmit needs
   to be initialized to false for every transmitted frame, in order to
   figure out whether the driver requested deferral or not (a very rare
   occasion, rare even for the only driver that does use this mechanism:
   sja1105). That was necessary to avoid kfree_skb from freeing the skb.

2. Because L2 PTP is a link-local protocol like STP, it requires
   management routes and deferred xmit with this switch. But as opposed
   to STP, the deferred work mechanism needs to schedule the packet
   rather quickly for the TX timstamp to be collected in time and sent
   to user space. But there is no provision for controlling the
   scheduling priority of this deferred xmit workqueue. Too bad this is
   a rather specific requirement for a feature that nobody else uses
   (more below).

3. Perhaps most importantly, it makes the DSA core adhere a bit too
   much to the NXP company-wide policy "Innovate Where It Doesn't
   Matter". The sja1105 is probably the only DSA switch that requires
   some frames sent from the CPU to be routed to the slave port via an
   out-of-band configuration (register write) rather than in-band (DSA
   tag). And there are indeed very good reasons to not want to do that:
   if that out-of-band register is at the other end of a slow bus such
   as SPI, then you limit that Ethernet flow's throughput to effectively
   the throughput of the SPI bus. So hardware vendors should definitely
   not be encouraged to design this way. We do _not_ want more
   widespread use of this mechanism.

Luckily we have a solution for each of the 3 issues:

For 1, we can just remove that variable in the skb->cb and counteract
the effect of kfree_skb with skb_get, much to the same effect. The
advantage, of course, being that anybody who doesn't use deferred xmit
doesn't need to do any extra operation in the hotpath.

For 2, we can create a kernel thread for each port's deferred xmit work.
If the user switch ports are named swp0, swp1, swp2, the kernel threads
will be named swp0_xmit, swp1_xmit, swp2_xmit (there appears to be a 15
character length limit on kernel thread names). With this, the user can
change the scheduling priority with chrt $(pidof swp2_xmit).

For 3, we can actually move the entire implementation to the sja1105
driver.

So this patch deletes the generic implementation from the DSA core and
adds a new one, more adequate to the requirements of PTP TX
timestamping, in sja1105_main.c.

Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-01-05 15:13:13 -08:00