linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-11 08:56:40 +07:00

Author	SHA1	Message	Date
Daniel Borkmann	0ea7eeec24	mm, percpu: add support for __GFP_NOWARN flag Add an option for pcpu_alloc() to support __GFP_NOWARN flag. Currently, we always throw a warning when size or alignment is unsupported (and also dump stack on failed allocation requests). The warning itself is harmless since we return NULL anyway for any failed request, which callers are required to handle anyway. However, it becomes harmful when panic_on_warn is set. The rationale for the WARN() in pcpu_alloc() is that it can be tracked when larger than supported allocation requests are made such that allocations limits can be tweaked if warranted. This makes sense for in-kernel users, however, there are users of pcpu allocator where allocation size is derived from user space requests, e.g. when creating BPF maps. In these cases, the requests should fail gracefully without throwing a splat. The current work-around was to check allocation size against the upper limit of PCPU_MIN_UNIT_SIZE from call-sites for bailing out prior to a call to pcpu_alloc() in order to avoid throwing the WARN(). This is bad in multiple ways since PCPU_MIN_UNIT_SIZE is an implementation detail, and having the checks on call-sites only complicates the code for no good reason. Thus, lets fix it generically by supporting the __GFP_NOWARN flag that users can then use with calling the __alloc_percpu_gfp() helper instead. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Tejun Heo <tj@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-19 13:13:49 +01:00
David S. Miller	3fd3b03b43	Merge branch 'ena-fixes' Netanel Belgazal says: ==================== ENA ethernet driver bug fixes Some fixes for ENA ethernet driver ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-19 12:49:16 +01:00
Netanel Belgazal	a59df39676	net: ena: fix wrong max Tx/Rx queues on ethtool ethtool ena_get_channels() expose the max number of queues as the max number of queues ENA supports (128 queues) and not the actual number of created queues. Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-19 12:49:15 +01:00
Netanel Belgazal	411838e7b4	net: ena: fix rare kernel crash when bar memory remap fails This failure is rare and only found on testing where deliberately fail devm_ioremap() [ 451.170464] ena 0000:04:00.0: failed to remap regs bar 451.170549] Workqueue: pciehp-1 pciehp_power_thread [ 451.170551] task: ffff88085a5f2d00 task.stack: ffffc9000756c000 [ 451.170552] RIP: 0010:devm_iounmap+0x2d/0x40 [ 451.170553] RSP: 0018:ffffc9000756fac0 EFLAGS: 00010282 [ 451.170554] RAX: 00000000fffffffe RBX: 0000000000000000 RCX: 0000000000000000 [ 451.170555] RDX: ffffffff813a7e00 RSI: 0000000000000282 RDI: 0000000000000282 [ 451.170556] RBP: ffffc9000756fac8 R08: 00000000fffffffe R09: 00000000000009b7 [ 451.170557] R10: 0000000000000005 R11: 00000000000009b6 R12: ffff880856c9d0a0 [ 451.170558] R13: ffffc9000f5c90c0 R14: ffff880856c9d0a0 R15: 0000000000000028 [ 451.170559] FS: 0000000000000000(0000) GS:ffff88085f400000(0000) knlGS:0000000000000000 [ 451.170560] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 451.170561] CR2: 00007f169038b000 CR3: 0000000001c09000 CR4: 00000000003406f0 [ 451.170562] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 451.170562] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 451.170563] Call Trace: [ 451.170572] ena_release_bars.isra.48+0x34/0x60 [ena] [ 451.170574] ena_probe+0x144/0xd90 [ena] [ 451.170579] ? ida_simple_get+0x98/0x100 [ 451.170585] ? kernfs_next_descendant_post+0x40/0x50 [ 451.170591] local_pci_probe+0x45/0xa0 [ 451.170592] pci_device_probe+0x157/0x180 [ 451.170599] driver_probe_device+0x2a8/0x460 [ 451.170600] __device_attach_driver+0x7e/0xe0 [ 451.170602] ? driver_allows_async_probing+0x30/0x30 [ 451.170603] bus_for_each_drv+0x68/0xb0 [ 451.170605] __device_attach+0xdd/0x160 [ 451.170607] device_attach+0x10/0x20 [ 451.170610] pci_bus_add_device+0x4f/0xa0 [ 451.170611] pci_bus_add_devices+0x39/0x70 [ 451.170613] pciehp_configure_device+0x96/0x120 [ 451.170614] pciehp_enable_slot+0x1b3/0x290 [ 451.170616] pciehp_power_thread+0x3b/0xb0 [ 451.170622] process_one_work+0x149/0x360 [ 451.170623] worker_thread+0x4d/0x3c0 [ 451.170626] kthread+0x109/0x140 [ 451.170627] ? rescuer_thread+0x380/0x380 [ 451.170628] ? kthread_park+0x60/0x60 [ 451.170632] ret_from_fork+0x25/0x30 Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-19 12:49:15 +01:00
Netanel Belgazal	cd7aea1875	net: ena: reduce the severity of some printouts Decrease log level of checksum errors as these messages can be triggered remotely by bad packets. Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-19 12:49:15 +01:00
Jakub Kicinski	28e33f9d78	bpf: disallow arithmetic operations on context pointer Commit `f1174f77b5` ("bpf/verifier: rework value tracking") removed the crafty selection of which pointer types are allowed to be modified. This is OK for most pointer types since adjust_ptr_min_max_vals() will catch operations on immutable pointers. One exception is PTR_TO_CTX which is now allowed to be offseted freely. The intent of aforementioned commit was to allow context access via modified registers. The offset passed to ->is_valid_access() verifier callback has been adjusted by the value of the variable offset. What is missing, however, is taking the variable offset into account when the context register is used. Or in terms of the code adding the offset to the value passed to the ->convert_ctx_access() callback. This leads to the following eBPF user code: r1 += 68 r0 = (u32 )(r1 + 8) exit being translated to this in kernel space: 0: (07) r1 += 68 1: (61) r0 = (u32 )(r1 +180) 2: (95) exit Offset 8 is corresponding to 180 in the kernel, but offset 76 is valid too. Verifier will "accept" access to offset 68+8=76 but then "convert" access to offset 8 as 180. Effective access to offset 248 is beyond the kernel context. (This is a __sk_buff example on a debug-heavy kernel - packet mark is 8 -> 180, 76 would be data.) Dereferencing the modified context pointer is not as easy as dereferencing other types, because we have to translate the access to reading a field in kernel structures which is usually at a different offset and often of a different size. To allow modifying the pointer we would have to make sure that given eBPF instruction will always access the same field or the fields accessed are "compatible" in terms of offset and size... Disallow dereferencing modified context pointers and add to selftests the test case described here. Fixes: `f1174f77b5` ("bpf/verifier: rework value tracking") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-18 13:21:13 +01:00
Johannes Berg	48044eb490	netlink: fix netlink_ack() extack race It seems that it's possible to toggle NETLINK_F_EXT_ACK through setsockopt() while another thread/CPU is building a message inside netlink_ack(), which could then trigger the WARN_ON()s I added since if it goes from being turned off to being turned on between allocating and filling the message, the skb could end up being too small. Avoid this whole situation by storing the value of this flag in a separate variable and using that throughout the function instead. Fixes: `2d4bc93368` ("netlink: extended ACK reporting") Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-18 12:22:28 +01:00
Thomas Falcon	2de09681e4	ibmvnic: Fix calculation of number of TX header descriptors This patch correctly sets the number of additional header descriptors that will be sent in an indirect SCRQ entry. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-18 12:20:39 +01:00
Ido Schimmel	d965465b60	mlxsw: core: Fix possible deadlock When an EMAD is transmitted, a timeout work item is scheduled with a delay of 200ms, so that another EMAD will be retried until a maximum of five retries. In certain situations, it's possible for the function waiting on the EMAD to be associated with a work item that is queued on the same workqueue (`mlxsw_core`) as the timeout work item. This results in flushing a work item on the same workqueue. According to commit `e159489baa` ("workqueue: relax lockdep annotation on flush_work()") the above may lead to a deadlock in case the workqueue has only one worker active or if the system in under memory pressure and the rescue worker is in use. The latter explains the very rare and random nature of the lockdep splats we have been seeing: [ 52.730240] ============================================ [ 52.736179] WARNING: possible recursive locking detected [ 52.742119] 4.14.0-rc3jiri+ #4 Not tainted [ 52.746697] -------------------------------------------- [ 52.752635] kworker/1:3/599 is trying to acquire lock: [ 52.758378] (mlxsw_core_driver_name){+.+.}, at: [<ffffffff811c4fa4>] flush_work+0x3a4/0x5e0 [ 52.767837] but task is already holding lock: [ 52.774360] (mlxsw_core_driver_name){+.+.}, at: [<ffffffff811c65c4>] process_one_work+0x7d4/0x12f0 [ 52.784495] other info that might help us debug this: [ 52.791794] Possible unsafe locking scenario: [ 52.798413] CPU0 [ 52.801144] ---- [ 52.803875] lock(mlxsw_core_driver_name); [ 52.808556] lock(mlxsw_core_driver_name); [ 52.813236] * DEADLOCK * [ 52.819857] May be due to missing lock nesting notation [ 52.827450] 3 locks held by kworker/1:3/599: [ 52.832221] #0: (mlxsw_core_driver_name){+.+.}, at: [<ffffffff811c65c4>] process_one_work+0x7d4/0x12f0 [ 52.842846] #1: ((&(&bridge->fdb_notify.dw)->work)){+.+.}, at: [<ffffffff811c65c4>] process_one_work+0x7d4/0x12f0 [ 52.854537] #2: (rtnl_mutex){+.+.}, at: [<ffffffff822ad8e7>] rtnl_lock+0x17/0x20 [ 52.863021] stack backtrace: [ 52.867890] CPU: 1 PID: 599 Comm: kworker/1:3 Not tainted 4.14.0-rc3jiri+ #4 [ 52.875773] Hardware name: Mellanox Technologies Ltd. "MSN2100-CB2F"/"SA001017", BIOS 5.6.5 06/07/2016 [ 52.886267] Workqueue: mlxsw_core mlxsw_sp_fdb_notify_work [mlxsw_spectrum] [ 52.894060] Call Trace: [ 52.909122] __lock_acquire+0xf6f/0x2a10 [ 53.025412] lock_acquire+0x158/0x440 [ 53.047557] flush_work+0x3c4/0x5e0 [ 53.087571] __cancel_work_timer+0x3ca/0x5e0 [ 53.177051] cancel_delayed_work_sync+0x13/0x20 [ 53.182142] mlxsw_reg_trans_bulk_wait+0x12d/0x7a0 [mlxsw_core] [ 53.194571] mlxsw_core_reg_access+0x586/0x990 [mlxsw_core] [ 53.225365] mlxsw_reg_query+0x10/0x20 [mlxsw_core] [ 53.230882] mlxsw_sp_fdb_notify_work+0x2a3/0x9d0 [mlxsw_spectrum] [ 53.237801] process_one_work+0x8f1/0x12f0 [ 53.321804] worker_thread+0x1fd/0x10c0 [ 53.435158] kthread+0x28e/0x370 [ 53.448703] ret_from_fork+0x2a/0x40 [ 53.453017] mlxsw_spectrum 0000:01:00.0: EMAD retries (2/5) (tid=bf4549b100000774) [ 53.453119] mlxsw_spectrum 0000:01:00.0: EMAD retries (5/5) (tid=bf4549b100000770) [ 53.453132] mlxsw_spectrum 0000:01:00.0: EMAD reg access failed (tid=bf4549b100000770,reg_id=200b(sfn),type=query,status=0(operation performed)) [ 53.453143] mlxsw_spectrum 0000:01:00.0: Failed to get FDB notifications Fix this by creating another workqueue for EMAD timeouts, thereby preventing the situation of a work item trying to flush a work item queued on the same workqueue. Fixes: `caf7297e7a` ("mlxsw: core: Introduce support for asynchronous EMAD register access") Signed-off-by: Ido Schimmel <idosch@mellanox.com> Reported-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-18 12:19:15 +01:00
David S. Miller	1b72bf5a07	Just a single fix, for a WoWLAN-related part of CVE-2017-13080. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEExu3sM/nZ1eRSfR9Ha3t4Rpy0AB0FAlnkt7YACgkQa3t4Rpy0 AB3UZg/8CsZr3SeDM1CxLTmDxJ734qq2kcoBo8nJgJqSwIWE5goGXCJSIGQw0o9O X8IF/xKj+Vgqfmxs73dnZM15klkVc9HBOdMdnjayWZZa5h6wKFFVNdjkMsrFxdV8 iugcCaUYCg1/Sc/qq4hBWLe5k3mQa5fB1clr7qykYEfIuY2ORpEJ+P2Uqjb5/RKx ugLO1qdCVMk7kqB/Fc8XKrPdiCgmfaMdz9lTqm1yXDThQe0rNxmWR339YpleqEiA lWIOZ53OlHZ3LTz8YDPcNA7mf5KtK1qQc60kZLygDcy5swvKaSIUaLfYOIjsPLIi 5tJ4tztEPpi5RTMPHkOLCcdGCT2bQ9InCmgSmEwymMbn36vaLuyH+9pfkqgbunUq M1C07VNG9HUi+Qz6mhtbY6ZyG0s1M09rAouCOGo4T9x7YHA73BetvcxYcxhdTm6t vOTDd/xiWM1Dsgc2tu41A+jB3cShi5so/OLE1KcLe4r1cso6+FdR+p5VW6XyRV73 Kbdh39azsdlyK6L+BW6cWCm/DUstourVmOz7mPebE0NW7N8cEhTtvclu0dz1gdKL nBMha+lOBmy2+acU5Z58f6/jG8RCMcyiD+0z9yrZmOmO4OgVyxHXP2GrsQ1d/Qr5 vhBLT1LFxcvJpIlKZy3dqv1lZQ10suwcsFzJG3NsQbczV7Vk9aM= =Iwyf -----END PGP SIGNATURE----- Merge tag 'mac80211-for-davem-2017-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Just a single fix, for a WoWLAN-related part of CVE-2017-13080. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-16 21:27:16 +01:00
Xin Long	823038ca03	dev_ioctl: add missing NETDEV_CHANGE_TX_QUEUE_LEN event notification When changing dev tx_queue_len via netlink or net-sysfs, a NETDEV_CHANGE_TX_QUEUE_LEN event notification will be called. But dev_ioctl missed this event notification, which could cause no userspace notification would be sent. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-16 21:23:10 +01:00
Or Gerlitz	c019b5166e	net/sched: cls_flower: Set egress_dev mark when calling into the HW driver Commit `7091d8c` '(net/sched: cls_flower: Add offload support using egress Hardware device') made sure (when fl_hw_replace_filter is called) to put the egress_dev mark on persisent structure instance. Hence, following calls into the HW driver for stats and deletion will note it and act accordingly. With commit `de4784ca03` this property is lost and hence when called, the HW driver failes to operate (stats, delete) on the offloaded flow. Fix it by setting the egress_dev flag whenever the ingress device is different from the hw device since this is exactly the condition under which we're calling into the HW driver through the egress port net-device. Fixes: `de4784ca03` ('net: sched: get rid of struct tc_to_netdev') Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roi Dayan <roid@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-16 21:18:59 +01:00
Cong Wang	0ad646c81b	tun: call dev_get_valid_name() before register_netdevice() register_netdevice() could fail early when we have an invalid dev name, in which case ->ndo_uninit() is not called. For tun device, this is a problem because a timer etc. are already initialized and it expects ->ndo_uninit() to clean them up. We could move these initializations into a ->ndo_init() so that register_netdevice() knows better, however this is still complicated due to the logic in tun_detach(). Therefore, I choose to just call dev_get_valid_name() before register_netdevice(), which is quicker and much easier to audit. And for this specific case, it is already enough. Fixes: `96442e4242` ("tuntap: choose the txq based on rxq") Reported-by: Dmitry Alexeev <avekceeb@gmail.com> Cc: Jason Wang <jasowang@redhat.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-16 21:02:54 +01:00
Nicolas Dichtel	2459b4c635	net: enable interface alias removal via rtnl IFLA_IFALIAS is defined as NLA_STRING. It means that the minimal length of the attribute is 1 ("\0"). However, to remove an alias, the attribute length must be 0 (see dev_set_alias()). Let's define the type to NLA_BINARY to allow 0-length string, so that the alias can be removed. Example: $ ip l s dummy0 alias foo $ ip l l dev dummy0 5: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether ae:20:30:4f:a7:f3 brd ff:ff:ff:ff:ff:ff alias foo Before the patch: $ ip l s dummy0 alias "" RTNETLINK answers: Numerical result out of range After the patch: $ ip l s dummy0 alias "" $ ip l l dev dummy0 5: dummy0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether ae:20:30:4f:a7:f3 brd ff:ff:ff:ff:ff:ff CC: Oliver Hartkopp <oliver@hartkopp.net> CC: Stephen Hemminger <stephen@networkplumber.org> Fixes: `96ca4a2cc1` ("net: remove ifalias on empty given alias") Reported-by: Julien FLoret <julien.floret@6wind.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-16 20:52:43 +01:00
David S. Miller	2fd7c5abb8	Merge branch 'rtnetlink-dev-notification-fixes' Xin Long says: ==================== rtnetlink: a bunch of fixes for userspace notifications in changing dev properties Whenever any property of a link, address, route, etc. changes by whatever way, kernel should notify the programs that listen for such events in userspace. The patchet "rtnetlink: Cleanup user notifications for netdev events" tried to fix a redundant notifications issue, but it also introduced a side effect. After that, user notifications could only be sent when changing dev properties via netlink api. As it removed some events process in rtnetlink_event where the notifications was sent to users. It resulted in no notification generated when dev properties are changed via other ways, like ioctl, sysfs, etc. It may cause some user programs doesn't work as expected because of the missing notifications. This patchset will fix it by bringing some of these netdev events back and also fix the old redundant notifications issue with a proper way. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-16 20:48:45 +01:00
Xin Long	2d7f669b42	rtnetlink: do not set notification for tx_queue_len in do_setlink NETDEV_CHANGE_TX_QUEUE_LEN event process in rtnetlink_event would send a notification for userspace and tx_queue_len's setting in do_setlink would trigger NETDEV_CHANGE_TX_QUEUE_LEN. So it shouldn't set DO_SETLINK_NOTIFY status for this change to send a notification any more. Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-16 20:48:45 +01:00
Xin Long	64ff90cc2e	rtnetlink: check DO_SETLINK_NOTIFY correctly in do_setlink The check 'status & DO_SETLINK_NOTIFY' in do_setlink doesn't really work after status & DO_SETLINK_MODIFIED, as: DO_SETLINK_MODIFIED 0x1 DO_SETLINK_NOTIFY 0x3 Considering that notifications are suppposed to be sent only when status have the flag DO_SETLINK_NOTIFY, the right check would be: (status & DO_SETLINK_NOTIFY) == DO_SETLINK_NOTIFY This would avoid lots of duplicated notifications when setting some properties of a link. Fixes: `ba9989069f` ("rtnl/do_setlink(): notify when a netdev is modified") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: David Ahern <dsahern@gmail.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-16 20:48:45 +01:00
Xin Long	dc709f3757	rtnetlink: bring NETDEV_CHANGEUPPER event process back in rtnetlink_event libteam needs this event notification in userspace when dev's master dev has been changed. After this, the redundant notifications issue would be fixed in the later patch 'rtnetlink: check DO_SETLINK_NOTIFY correctly in do_setlink'. Fixes: `b6b36eb23a` ("rtnetlink: Do not generate notifications for NETDEV_CHANGEUPPER event") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-16 20:48:45 +01:00
Xin Long	e6e6659446	rtnetlink: bring NETDEV_POST_TYPE_CHANGE event process back in rtnetlink_event As I said in patch 'rtnetlink: bring NETDEV_CHANGEMTU event process back in rtnetlink_event', removing NETDEV_POST_TYPE_CHANGE event was not the right fix for the redundant notifications issue. So bring this event process back to rtnetlink_event and the old redundant notifications issue would be fixed in the later patch 'rtnetlink: check DO_SETLINK_NOTIFY correctly in do_setlink'. Fixes: `aef091ae58` ("rtnetlink: Do not generate notifications for POST_TYPE_CHANGE event") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-16 20:48:45 +01:00
Xin Long	ebdcf0450b	rtnetlink: bring NETDEV_CHANGE_TX_QUEUE_LEN event process back in rtnetlink_event The same fix for changing mtu in the patch 'rtnetlink: bring NETDEV_CHANGEMTU event process back in rtnetlink_event' is needed for changing tx_queue_len. Note that the redundant notifications issue for tx_queue_len will be fixed in the later patch 'rtnetlink: do not send notification for tx_queue_len in do_setlink'. Fixes: `27b3b551d8` ("rtnetlink: Do not generate notifications for NETDEV_CHANGE_TX_QUEUE_LEN event") Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-16 20:48:44 +01:00
Xin Long	8a212589fe	rtnetlink: bring NETDEV_CHANGEMTU event process back in rtnetlink_event Commit `085e1a65f0` ("rtnetlink: Do not generate notifications for MTU events") tried to fix the redundant notifications issue when ip link set mtu by removing NETDEV_CHANGEMTU event process in rtnetlink_event. But it also resulted in no notification generated when dev's mtu is changed via other methods, like: 'ifconfig eth1 mtu 1400' or 'echo 1400 > /sys/class/net/eth1/mtu' It would cause users not to be notified by this change. This patch is to fix it by bringing NETDEV_CHANGEMTU event back into rtnetlink_event, and the redundant notifications issue will be fixed in the later patch 'rtnetlink: check DO_SETLINK_NOTIFY correctly in do_setlink'. Fixes: `085e1a65f0` ("rtnetlink: Do not generate notifications for MTU events") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-16 20:48:44 +01:00
Johannes Berg	fdf7cb4185	mac80211: accept key reinstall without changing anything When a key is reinstalled we can reset the replay counters etc. which can lead to nonce reuse and/or replay detection being impossible, breaking security properties, as described in the "KRACK attacks". In particular, CVE-2017-13080 applies to GTK rekeying that happened in firmware while the host is in D3, with the second part of the attack being done after the host wakes up. In this case, the wpa_supplicant mitigation isn't sufficient since wpa_supplicant doesn't know the GTK material. In case this happens, simply silently accept the new key coming from userspace but don't take any action on it since it's the same key; this keeps the PN replay counters intact. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-10-16 13:02:03 +02:00
David S. Miller	8f718fa6bb	Merge branch 'bnxt_en-fixes' Michael Chan says: ==================== bnxt_en: bug fixes. Various bug fixes for the VF/PF link change logic, VF resource checking, potential firmware response corruption on NVRAM and DCB parameters, and reading the wrong register for PCIe link speed on the VF. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-14 18:51:52 -07:00
Sankar Patchineelam	5b1e1a9ce0	bnxt_en: Fix possible corruption in DCB parameters from firmware. hwrm_send_message() is replaced with _hwrm_send_message(), and hwrm_cmd_lock mutex lock is grabbed for the whole period of firmware call until the firmware DCB parameters have been copied. This will prevent possible corruption of the firmware data. Fixes: `7df4ae9fe8` ("bnxt_en: Implement DCBNL to support host-based DCBX.") Signed-off-by: Sankar Patchineelam <sankar.patchineelam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-14 18:51:51 -07:00
Michael Chan	cc72f3b1fe	bnxt_en: Fix possible corrupted NVRAM parameters from firmware response. In bnxt_find_nvram_item(), it is copying firmware response data after releasing the mutex. This can cause the firmware response data to be corrupted if the next firmware response overwrites the response buffer. The rare problem shows up when running ethtool -i repeatedly. Fix it by calling the new variant _hwrm_send_message_silent() that requires the caller to take the mutex and to release it after the response data has been copied. Fixes: `3ebf6f0a09` ("bnxt_en: Add installed-package version reporting via Ethtool GDRVINFO") Reported-by: Sarveswara Rao Mygapula <sarveswararao.mygapula@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-14 18:51:51 -07:00
Michael Chan	021570793d	bnxt_en: Fix VF resource checking. In bnxt_sriov_enable(), we calculate to see if we have enough hardware resources to enable the requested number of VFs. The logic to check for minimum completion rings and statistics contexts is missing. Add the required checks so that VF configuration won't fail. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-14 18:51:51 -07:00
Vasundhara Volam	7ab0760f51	bnxt_en: Fix VF PCIe link speed and width logic. PCIE PCIE_EP_REG_LINK_STATUS_CONTROL register is only defined in PF config space, so we must read it from the PF. Fixes: `90c4f788f6` ("bnxt_en: Report PCIe link speed and width during driver load") Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-14 18:51:51 -07:00
Michael Chan	e2dc9b6e38	bnxt_en: Don't use rtnl lock to protect link change logic in workqueue. As a further improvement to the PF/VF link change logic, use a private mutex instead of the rtnl lock to protect link change logic. With the new mutex, we don't have to take the rtnl lock in the workqueue when we have to handle link related functions. If the VF and PF drivers are running on the same host and both take the rtnl lock and one is waiting for the other, it will cause timeout. This patch fixes these timeouts. Fixes: `90c694bb71` ("bnxt_en: Fix RTNL lock usage on bnxt_update_link().") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-14 18:51:51 -07:00
Michael Chan	c213eae8d3	bnxt_en: Improve VF/PF link change logic. Link status query firmware messages originating from the VFs are forwarded to the PF. The driver handles these interactions in a workqueue for the VF and PF. The VF driver waits for the response from the PF in the workqueue. If the PF and VF driver are running on the same host and the work for both PF and VF are queued on the same workqueue, the VF driver may not get the response if the PF work item is queued behind it on the same workqueue. This will lead to the VF link query message timing out. To prevent this, we create a private workqueue for PFs instead of using the common workqueue. The VF query and PF response will never be on the same workqueue. Fixes: `c0c050c58d` ("bnxt_en: New Broadcom ethernet driver.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-14 18:51:51 -07:00
Vivien Didelot	3efc93c2bc	net: dsa: mv88e6060: fix switch MAC address The 88E6060 Ethernet switch always transmits the multicast bit of the switch MAC address as a zero. It re-uses the corresponding bit 8 of the register "Switch MAC Address Register Bytes 0 & 1" for "DiffAddr". If the "DiffAddr" bit is 0, then all ports transmit the same source address. If it is set to 1, then bit 2:0 are used for the port number. The mv88e6060 driver is currently wrongly shifting the MAC address byte 0 by 9. To fix this, shift it by 8 as usual and clear its bit 0. Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com> Reviewed-by: Woojung Huh <woojung.huh@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-14 18:40:03 -07:00
Guillaume Nault	5903f59493	l2tp: check ps->sock before running pppol2tp_session_ioctl() When pppol2tp_session_ioctl() is called by pppol2tp_tunnel_ioctl(), the session may be unconnected. That is, it was created by pppol2tp_session_create() and hasn't been connected with pppol2tp_connect(). In this case, ps->sock is NULL, so we need to check for this case in order to avoid dereferencing a NULL pointer. Fixes: `309795f4be` ("l2tp: Add netlink control API for L2TP") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-14 18:38:56 -07:00
Wenhua Shi	09001b03f7	net: fix typo in skbuff.c Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-14 18:23:43 -07:00
Emiliano Ingrassia	b984986067	net: stmmac: dwmac_lib: fix interchanged sleep/timeout values in DMA reset function The DMA reset timeout, used in read_poll_timeout, is ten times shorter than the sleep time. This patch fixes these values interchanging them, as it was before the read_poll_timeout introduction. Fixes: `8a70aeca80` ("net: stmmac: Use readl_poll_timeout") Signed-off-by: Emiliano Ingrassia <ingrassia@epigenesys.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-13 10:19:52 -07:00
Arnd Bergmann	e7ad97938e	liquidio: fix timespec64_to_ns typo While experimenting with changes to the timekeeping code, I ran into a build error in the liquidio driver: drivers/net/ethernet/cavium/liquidio/lio_main.c: In function 'liquidio_ptp_settime': drivers/net/ethernet/cavium/liquidio/lio_main.c:1850:22: error: passing argument 1 of 'timespec_to_ns' from incompatible pointer type [-Werror=incompatible-pointer-types] The driver had a type mismatch since it was first merged, but this never caused problems because it is only built on 64-bit architectures that define timespec and timespec64 to the same type. If we ever want to compile-test the driver on 32-bit or change the way that 64-bit timespec64 is defined, we need to fix it, so let's just do it now. Fixes: `f21fb3ed36` ("Add support of Cavium Liquidio ethernet adapters") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Felix Manlunas <felix.manlunas@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-13 10:18:38 -07:00
David S. Miller	db5972c95d	wireless-drivers fixes for 4.14 Nothing really special standing out, all of these are important fixes which should go to 4.14. iwlwifi * fix support for 3168 device series * fix a potential crash when using FW debugging recording; * improve channel flags parsing to avoid warnings on too long traces * return -ENODATA when the temperature is not available, since the -EIO we were returning was causing fatal errors in userspace * avoid printing too many messages in dmesg when using monitor mode, since this can become very noisy and completely flood the logs brcmsmac * reduce stack usage to avoid frame size warnings with KASAN brcmfmac * add a check to avoid copying uninitialised memory rtlwifi: * fix a regression with rtl8821ae starting from v4.11 where connections was frequently lost -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJZ4GYHAAoJEG4XJFUm622bXf8H/3DJF5P02OzWIVMK376xUgVI feX8tsazK0/cCgxeMfp/K4Nb9umZHFTNd1Zs5mrx8DFJjL1ekXTz93GtfcJ+WFH5 sPwS+IVOf4IaBrSiNpqskJhln3GHp7A+eFgqbFQ+Is4/eQQBqt/lW6LZoyA5DI6X 80hOT4eZtHanKNFQHuSgOGHXttAB9AcnEJngvfToZCV1jqadeoPK1GZQ+6Xsot3g RU5gYxc4gN+d203KhUxEebzz8buyfCEPG1wFze/mazPcQJ+8IJjMNOl474atKslW dsbqak2gtl70weZHeh2PlYkNvTZbAbiWk9feaTGeuwAgwWDms1BJKUKRppfZdqE= =RA6K -----END PGP SIGNATURE----- Merge tag 'wireless-drivers-for-davem-2017-10-13' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers Kalle Valo says: ==================== wireless-drivers fixes for 4.14 Nothing really special standing out, all of these are important fixes which should go to 4.14. iwlwifi * fix support for 3168 device series * fix a potential crash when using FW debugging recording; * improve channel flags parsing to avoid warnings on too long traces * return -ENODATA when the temperature is not available, since the -EIO we were returning was causing fatal errors in userspace * avoid printing too many messages in dmesg when using monitor mode, since this can become very noisy and completely flood the logs brcmsmac * reduce stack usage to avoid frame size warnings with KASAN brcmfmac * add a check to avoid copying uninitialised memory rtlwifi: * fix a regression with rtl8821ae starting from v4.11 where connections was frequently lost ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-13 08:42:53 -07:00
Stephen Hemminger	12ed3772b7	ip: update policy routing config help The kernel config help for policy routing was still pointing at an ancient document from 2000 that refers to Linux 2.1. Update it to point to something that is at least occasionally updated. Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-12 22:57:11 -07:00
Samuel Mendoza-Jonas	6e9c007540	net/ncsi: Don't limit vids based on hot_channel Currently we drop any new VLAN ids if there are more than the current (or last used) channel can support. Most importantly this is a problem if no channel has been selected yet, resulting in a segfault. Secondly this does not necessarily reflect the capabilities of any other channels. Instead only drop a new VLAN id if we are already tracking the maximum allowed by the NCSI specification. Per-channel limits are already handled by ncsi_add_filter(), but add a message to set_one_vid() to make it obvious that the channel can not support any more VLAN ids. Signed-off-by: Samuel Mendoza-Jonas <sam@mendozajonas.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-11 20:10:37 -07:00
Daniel Drake	bde135a672	r8169: only enable PCI wakeups when WOL is active rtl_init_one() currently enables PCI wakeups if the ethernet device is found to be WOL-capable. There is no need to do this when rtl8169_set_wol() will correctly enable or disable the same wakeup flag when WOL is activated/deactivated. This works around an ACPI DSDT bug which prevents the Acer laptop models Aspire ES1-533, Aspire ES1-732, PackardBell ENTE69AP and Gateway NE533 from entering S3 suspend - even when no ethernet cable is connected. On these platforms, the DSDT says that GPE08 is a wakeup source for ethernet, but this GPE fires as soon as the system goes into suspend, waking the system up immediately. Having the wakeup normally disabled avoids this issue in the default case. With this change, WOL will continue to be unusable on these platforms (it will instantly wake up if WOL is later enabled by the user) but we do not expect this to be a commonly used feature on these consumer laptops. We have separately determined that WOL works fine without any ACPI GPEs enabled during sleep, so a DSDT fix or override would be possible to make WOL work. Signed-off-by: Daniel Drake <drake@endlessm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-11 15:26:17 -07:00
Sabrina Dubroca	5aba2ba503	macsec: fix memory leaks when skb_to_sgvec fails Fixes: `cda7ea6903` ("macsec: check return value of skb_to_sgvec always") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-11 14:07:20 -07:00
Eric Dumazet	c0576e3975	net: call cgroup_sk_alloc() earlier in sk_clone_lock() If for some reason, the newly allocated child need to be freed, we will call cgroup_put() (via sk_free_unlock_clone()) while the corresponding cgroup_get() was not yet done, and we will free memory too soon. Fixes: `d979a39d72` ("cgroup: duplicate cgroup reference when cloning sockets") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-10 20:24:29 -07:00
Eric Dumazet	75cb070960	Revert "net: defer call to cgroup_sk_alloc()" This reverts commit `fbb1fb4ad4`. This was not the proper fix, lets cleanly revert it, so that following patch can be carried to stable versions. sock_cgroup_ptr() callers do not expect a NULL return value. Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-10 20:24:29 -07:00
David S. Miller	befc7a3324	Merge branch 'nfp-fix-ethtool-stats-and-page-allocation' Jakub Kicinski says: ==================== nfp: fix ethtool stats and page allocation Two fixes for net. First one makes sure we handle gather of stats on 32bit machines correctly (ouch). The second fix solves a potential NULL-deref if we fail to allocate a page with XDP running. I used Fixes: tags pointing to where the bug was introduced, but for patch 1 it has been in the driver "for ever" and fix won't backport cleanly beyond commit `325945ede6` ("nfp: split software and hardware vNIC statistics") which is in net. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-10 13:18:34 -07:00
Jakub Kicinski	5f0ca2fb71	nfp: handle page allocation failures page_address() does not handle NULL argument gracefully, make sure we NULL-check the page pointer before passing it to page_address(). Fixes: `ecd63a0217` ("nfp: add XDP support in the driver") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-10 13:18:33 -07:00
Jakub Kicinski	c3d64ad4fe	nfp: fix ethtool stats gather retry The while loop fetching 64 bit ethtool statistics may have to retry multiple times, it shouldn't modify the outside state. Fixes: `4c3523623d` ("net: add driver for Netronome NFP4000/NFP6000 NIC VFs") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-10 13:18:33 -07:00
David S. Miller	3668bb8da1	Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-queue Jeff Kirsher says: ==================== Intel Wired LAN Driver Updates 2017-10-10 This series contains updates to i40e only. Stefano Brivio fixes the grammar in a function header comment. Alex fixes a memory leak where we were not correctly placing the pages from buffers that had been used to return a filter programming status back on the ring. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-10 13:17:11 -07:00
Behan Webster	365ff9df56	wimax/i2400m: Remove VLAIS Convert Variable Length Array in Struct (VLAIS) to valid C by converting local struct definition to use a flexible array. The structure is only used to define a cast of a buffer so the size of the struct is not used to allocate storage. Signed-off-by: Behan Webster <behanw@converseincode.com> Signed-off-by: Mark Charebois <charlebm@gmail.com> Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-10 12:35:05 -07:00
Alexander Duyck	2b9478ffc5	i40e: Fix memory leak related filter programming status It looks like we weren't correctly placing the pages from buffers that had been used to return a filter programming status back on the ring. As a result they were being overwritten and tracking of the pages was lost. This change works to correct that by incorporating part of i40e_put_rx_buffer into the programming status handler code. As a result we should now be correctly placing the pages for those buffers on the re-allocation list instead of letting them stay in place. Fixes: `0e626ff7cc` ("i40e: Fix support for flow director programming status") Reported-by: Anders K. Pedersen <akp@cohaesio.com> Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Anders K Pedersen <akp@cohaesio.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-10 08:04:36 -07:00
Stefano Brivio	e836e32112	i40e: Fix comment about locking for __i40e_read_nvm_word() Caller needs to acquire the lock. Called functions will not. Fixes: `09f79fd49d` ("i40e: avoid NVM acquire deadlock during NVM update") Signed-off-by: Stefano Brivio <sbrivio@redhat.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2017-10-10 07:54:35 -07:00
Eric Dumazet	fbb1fb4ad4	net: defer call to cgroup_sk_alloc() sk_clone_lock() might run while TCP/DCCP listener already vanished. In order to prevent use after free, it is better to defer cgroup_sk_alloc() to the point we know both parent and child exist, and from process context. Fixes: `e994b2f0fb` ("tcp: do not lock listener to process SYN packets") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 20:55:01 -07:00
Eric Dumazet	9f1c2674b3	net: memcontrol: defer call to mem_cgroup_sk_alloc() Instead of calling mem_cgroup_sk_alloc() from BH context, it is better to call it from inet_csk_accept() in process context. Not only this removes code in mem_cgroup_sk_alloc(), but it also fixes a bug since listener might have been dismantled and css_get() might cause a use-after-free. Fixes: `e994b2f0fb` ("tcp: do not lock listener to process SYN packets") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-10-09 20:55:01 -07:00

1 2 3 4 5 ...

707189 Commits