linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-17 14:36:46 +07:00

Author	SHA1	Message	Date
Nogah Frankel	23432cb86a	mlxsw: resources: Add max trap groups resource Add the max number of trap groups to resource query. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 21:22:14 -05:00
Nogah Frankel	9d87fceac6	mlxsw: core: Change emad trap group settings Currently, the emad trap init was done in the core. In the future we will want to add some changes to the traps groups, according to device type. This commit create a driver function to create the trap group for the emad, so later it can be changed by devices. It also changes the emad registration to use the new generic functions. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 21:22:14 -05:00
Nogah Frankel	0fb78a4e9c	mlxsw: Add option to choose trap group Currently, we set the trap group to pre-determined option, based on whether it is an rx or event trap. This commit adds a possibility to chose the trap group, so it can be set to different values in the following patches. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 21:22:14 -05:00
Nogah Frankel	d570b7ee4e	mlxsw: Change trap set function Change trap setting function so instead of determining the trap group by trap id, it gets it as a parameter (so later we can have different trap groups for Spectrum and Switchx2). Add "is_ctrl" parameter to the trap setting function. It control whether the trapped packets wait in a designated control buffer or in their default one. This parameter is ignored by Switchx2 and Switchib. Add these parameters to the traps array in Spectrum, Switchx2 and Switchib. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 21:22:14 -05:00
Nogah Frankel	85d5c9cd90	mlxsw: switchib: Use generic listener struct for events Change the event handling in Switchib to be comptible with Spectrum and Switchx2. Use the generic listener struct for the events. Init and fini them by loop (and not by calling each event by its name). Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 21:22:14 -05:00
Nogah Frankel	6bf08b53ed	mlxsw: switchx2: Use generic listener struct for events Change the events to use the generic listener struct. Merge the event list into the trap list, so the same functions will handle both. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 21:22:14 -05:00
Nogah Frankel	4544913ed7	mlxsw: spectrum: Use generic listener struct for events Change the events to use the generic listener struct. Merge the event list into the trap list, so the same functions will handle both. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 21:22:14 -05:00
Nogah Frankel	fb9012d93f	mlxsw: core: Introduce generic macro for event Create a macro for creating the generic listener struct for events, similar to the one for rx traps. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 21:22:14 -05:00
Nogah Frankel	2332d8c7df	mlxsw: switchx2: Use generic listener struct for rx traps Reorganize the traps to use the new generic listener struct and functions. Use macros to shorten the traps list. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 21:22:14 -05:00
Nogah Frankel	14eeda99c4	mlxsw: spectrum: Use generic listener struct for rx traps Replace the old rx listener struct definitions by the generic ones. Use the new generic registering / unregistering functions for them. Add some macros to organize the trap list. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 21:22:14 -05:00
Nogah Frankel	b63da93de8	mlxsw: core: Expose generic macros for rx trap In Spectrum, there is a macro to arrange the traps list. This macro is useful for everyone who is using rx traps. Create a similar macro in core.h for creating the generic listener struct for rx traps. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 21:22:14 -05:00
Nogah Frankel	0791051c43	mlxsw: core: Create a generic function to register / unregister traps We have 2 types of HW traps to handle, rx traps and events. The registration workflow for both is very similar. So it only make sense to create one function to handle both. This patch creates a struct to hold the data for both cases. It also creates a registration and an un-registration functions that get this generic struct as input. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 21:22:14 -05:00
Nogah Frankel	ee4a60d898	mlxsw: spectrum: Remove unused traps Since commit `99724c18fc` ("mlxsw: spectrum: Introduce support for router interfaces") we no longer rely on flooding traffic to the CPU in order to trap packets intended for the host itself. Therefore, the FDB MC trap can be removed. Remove traps for protocols that are not supported yet. Signed-off-by: Nogah Frankel <nogahf@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 21:22:14 -05:00
Arnd Bergmann	e8f967c3d8	mvpp2: use correct size for memset gcc-7 detects a short memset in mvpp2, introduced in the original merge of the driver: drivers/net/ethernet/marvell/mvpp2.c: In function 'mvpp2_cls_init': drivers/net/ethernet/marvell/mvpp2.c:3296:2: error: 'memset' used with length equal to number of elements without multiplication by element size [-Werror=memset-elt-size] The result seems to be that we write uninitialized data into the flow table registers, although we did not get any warning about that uninitialized data usage. Using sizeof() lets us initialize then entire array instead. Fixes: `3f518509de` ("ethernet: Add new driver for Marvell Armada 375 network unit") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 20:57:21 -05:00
Geliang Tang	5e7dfeb758	net/mlx5: drop duplicate header delay.h Drop duplicate header delay.h from mlx5/core/main.c. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Acked-by: Matan Barak <matanb@mellanox.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 20:33:10 -05:00
Geliang Tang	8f8a8b13b4	net: ieee802154: drop duplicate header delay.h Drop duplicate header delay.h from adf7242.c. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Acked-by: Stefan Schmidt <stefan@osg.samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 20:33:10 -05:00
Geliang Tang	4ee12efa2d	ibmvnic: drop duplicate header seq_file.h Drop duplicate header seq_file.h from ibmvnic.c. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 20:32:10 -05:00
Dan Carpenter	1f1e70efe5	fsl/fman: fix a leak in tgec_free() We set "tgec->cfg" to NULL before passing it to kfree(). There is no need to set it to NULL at all. Let's just delete it. Fixes: `57ba4c9b56` ("fsl/fman: Add FMan MAC support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 20:29:48 -05:00
Dan Carpenter	eafa6abd99	net/mlx5: remove a duplicate condition We verified that MLX5_FLOW_CONTEXT_ACTION_COUNT was set on the first line of the function so we don't need to check again here. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 20:28:28 -05:00
Miroslav Lichvar	8006f6bf5e	net: ethtool: don't require CAP_NET_ADMIN for ETHTOOL_GLINKSETTINGS The ETHTOOL_GLINKSETTINGS command is deprecating the ETHTOOL_GSET command and likewise it shouldn't require the CAP_NET_ADMIN capability. Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 20:23:30 -05:00
David S. Miller	0ebc5b62a3	Merge branch 'thunderx-new-features' Sunil Goutham says: ==================== net: thunderx: Support for 80xx, RED, PFC e.t.c This patch series adds support for SLM modules present on 80xx silicon, enables ramdom early discard, backpressure generation, PFC and some ethtool changes to display supported link modes e.t.c. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 20:21:24 -05:00
Sunil Goutham	430da20808	net: thunderx: Pause frame support Enable pause frames on both Rx and Tx side, configure pause interval e.t.c. Also support for enable/disable pause frames on Rx/Tx via ethtool has been added. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 20:21:17 -05:00
Sunil Goutham	d5b2d7a718	net: thunderx: Configure RED and backpressure levels This patch enables moving average calculation of Rx pkt's resources and configures RED and backpressure levels for both CQ and RBDR. Also initialize SQ's CQ_LIMIT properly. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 20:21:17 -05:00
Thanneeru Srinivasulu	1cc702591b	net: thunderx: Add ethtool support for supported ports and link modes. Signed-off-by: Thanneeru Srinivasulu <tsrinivasulu@cavium.com> Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 20:21:17 -05:00
Sunil Goutham	5271156b1a	net: thunderx: 80xx BGX0 configuration changes On 80xx only one lane of DLM0 and DLM1 (of BGX0) can be used , so even though lmac count may be 2 but LMAC1 should use serdes lane of DLM1. Since it's not possible to distinguish 80xx from 81xx as PCI devid are same, this patch adds this config support by replying on what firmware configures the lmacs with. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 20:21:17 -05:00
Jon Paul Maloy	d876a4d2af	tipc: improve sanity check for received domain records In commit `35c55c9877` ("tipc: add neighbor monitoring framework") we added a data area to the link monitor STATE messages under the assumption that previous versions did not use any such data area. For versions older than Linux 4.3 this assumption is not correct. In those version, all STATE messages sent out from a node inadvertently contain a 16 byte data area containing a string; -a leftover from previous RESET messages which were using this during the setup phase. This string serves no purpose in STATE messages, and should no be there. Unfortunately, this data area is delivered to the link monitor framework, where a sanity check catches that it is not a correct domain record, and drops it. It also issues a rate limited warning about the event. Since such events occur much more frequently than anticipated, we now choose to remove the warning in order to not fill the kernel log with useless contents. We also make the sanity check stricter, to further reduce the risk that such data is inavertently admitted. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 20:06:18 -05:00
Jon Paul Maloy	f79675563a	tipc: fix compatibility bug in link monitoring commit `817298102b` ("tipc: fix link priority propagation") introduced a compatibility problem between TIPC versions newer than Linux 4.6 and those older than Linux 4.4. In versions later than 4.4, link STATE messages only contain a non-zero link priority value when the sender wants the receiver to change its priority. This has the effect that the receiver resets itself in order to apply the new priority. This works well, and is consistent with the said commit. However, in versions older than 4.4 a valid link priority is present in all sent link STATE messages, leading to cyclic link establishment and reset on the 4.6+ node. We fix this by adding a test that the received value should not only be valid, but also differ from the current value in order to cause the receiving link endpoint to reset. Reported-by: Amar Nv <amar.nv005@gmail.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 20:06:18 -05:00
Woojung Huh	a7dac9f9c1	phy: fix error case of phy_led_triggers_(un)register When phy_init_hw() fails at phy_attach_direct(); - phy_detach() calls phy_led_triggers_unregister() without previous call of phy_led_triggers_register(). - still call phy_led_triggers_register() and cause memory leak. Fixes: `2e0bc452f4` ("net: phy: leds: add support for led triggers on phy link state change") Signed-off-by: Woojung Huh <woojung.huh@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 19:57:57 -05:00
Andrew Lunn	97db8afa2a	net: ethernet: mvneta: Remove IFF_UNICAST_FLT which is not implemented The mvneta driver advertises it supports IFF_UNICAST_FLT. However, it actually does not. The hardware probably does support it, but there is no code to configure the filter. As a quick and simple fix, remove the flag. This will cause the core to fall back to promiscuous mode. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Fixes: `b50b72de2f` ("net: mvneta: enable features before registering the driver") Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 19:56:37 -05:00
Linus Torvalds	3ad0e83cf8	Merge branch 'parisc-4.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux Pull parisc fixes from Helge Deller: "On parisc we were still seeing occasional random segmentation faults and memory corruption on SMP machines. Dave Anglin then looked again at the TLB related code and found two issues in the PCI DMA and generic TLB flush functions. Then, in our startup code we had some timing of the cache and TLB functions to calculate a threshold when to use a complete TLB/cache flush or just to flush a specific range. This code produced a race with newly started CPUs and thus lead to occasional kernel crashes (due to stale TLB/cache entries). The patch by Dave fixes this issue by flushing the local caches before starting secondary CPUs and by removing the race. The last problem fixed by this series is that we quite often suffered from hung tasks and self-detected stalls on the CPUs. It was somehow clear that this was related to the (in v4.7) newly introduced cr16 clocksource and the own implementation of sched_clock(). I replaced the open-coded sched_clock() function and switched to the generic sched_clock() implementation which seems to have fixed this isse as well. All patches have been sucessfully tested on a variety of machines, including our debian buildd servers. All patches (beside the small pr_cont fix) are tagged for stable releases" * 'parisc-4.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux: parisc: Also flush data TLB in flush_icache_page_asm parisc: Fix race in pci-dma.c parisc: Switch to generic sched_clock implementation parisc: Fix races in parisc_setup_cache_timing() parisc: Fix printk continuations in system detection	2016-11-25 16:47:15 -08:00
Eric Dumazet	f52dffe049	net: properly flush delay-freed skbs Typical NAPI drivers use napi_consume_skb(skb) at TX completion time. This put skb in a percpu special queue, napi_alloc_cache, to get bulk frees. It turns out the queue is not flushed and hits the NAPI_SKB_CACHE_SIZE limit quite often, with skbs that were queued hundreds of usec earlier. I measured this can take ~6000 nsec to perform one flush. __kfree_skb_flush() can be called from two points right now : 1) From net_tx_action(), but only for skbs that were queued to sd->completion_queue. -> Irrelevant for NAPI drivers in normal operation. 2) From net_rx_action(), but only under high stress or if RPS/RFS has a pending action. This patch changes net_rx_action() to perform the flush in all cases and after more urgent operations happened (like kicking remote CPUS for RPS/RFS). Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 19:37:49 -05:00
Linus Torvalds	86b01b5419	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull keys fixes from James Morris: "From David: - Fix mpi_powm()'s handling of a number with a zero exponent [CVE-2016-8650]. Integrate my and Andrey's patches for mpi_powm() and use mpi_resize() instead of RESIZE_IF_NEEDED() - the latter adds a duplicate check into the execution path of a trivial case we don't normally expect to be taken. - Fix double free in X.509 error handling" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: mpi: Fix NULL ptr dereference in mpi_powm() [ver #3] X.509: Fix double free in x509_cert_parse() [ver #3]	2016-11-25 15:53:45 -08:00
Linus Torvalds	cd3caefb46	Fix subtle CONFIG_MODVERSIONS problems CONFIG_MODVERSIONS has been broken for pretty much the whole 4.9 series, and quite frankly, nobody has cared very deeply. We absolutely know how to fix it, and it's not _complicated_, but it's not exactly pretty either. This oneliner fixes it without the ugliness, and allows for further future cleanups. "We've secretly replaced their regular MODVERSIONS with nothing at all, let's see if they notice" Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-11-25 15:44:47 -08:00
Linus Torvalds	beb53e4b23	ACPI fixes for v4.9-rc7 - Revert the recent commit that caused the ACPI _PTS method to be executed in the power-off/reboot code path (as per the specification) in an attempt to improve things on some systems (apparently expecting _PTS to be executed in that code path), but broke power-off/reboot on at least one other machine (Rafael Wysocki). - Fix kernel builds with the new WDAT watchdog driver enabled in some configurations by explicitly selecting WATCHDOG_CORE when enabling the WDAT watchdog driver (Mika Westerberg). -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQIcBAABCAAGBQJYOK+oAAoJEILEb/54YlRxMB8P/jJbhBsaIDzn/UJAQs37Qhc+ ERykoY5v8Knsg896IfFn/YsJbRFBozbuVTOif8ePpMoUMAkhJMYs5uejSK6WeZK8 l0haoPKJ/UrWp1Tc5zkdY3n/rUUivtmX2r4nMU+qEhAp+zz7q+peZTUecpisdOvv kpQG0imnnQbphIGex2wxorox7BYsBeNXPq0tbtez6dz7dzw3i5+1OGcPgHZ5vb/i +wiH1LKI83CRB9dFGVN9cVVHgai2/nAGw/ZWimhI1qbeXQG1ZwR1CU61cvO/46CJ pj0P40nuA4XzIQPigzsGM+UvuxTw4vm5Cg+I3IMjl0T+PoeE0Hp5JfR+2DJlJtj5 UeX0oi5yGJjjpWdX8MOtzLSCloo7Gyv5W3/5JhL6j0Wp415rM02iRvZJ/fse6aqM E461rlwSniHx/0lo2cDh49oPIxS72BqCJJiho4im/9whPVDNvwYeRUG2/5mytbIs /5ERG7dmnizwjKVddZK2X6MrT/u6AzfiAIvLcDi4hPfeqaVEEphcIozmv0wc0qGr 2mNv4WHsH/cqP9baB7Cy+p+5TM6J+x2kfBwOrWg8g7XqiN4wOHzslSrWtf0Bv0oH gPh+lIwJCqWZ0T8nbtEKm940xr34LnJuswMDhYiSnCJuPAE5Fk0dXfvLD2lGoD0R 7UpwOYlFWqhGGqarttqk =q/vx -----END PGP SIGNATURE----- Merge tag 'acpi-4.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fixes from Rafael Wysocki: "Two ACPI fixes for 4.9-rc7. One of them reverts a recent ACPI commit that attempted to improve reboot/power-off on some systems, but introduced problems elsewhere, and the other one fixes kernel builds with the new WDAT watchdog driver enabled in some configurations. Specifics: - Revert the recent commit that caused the ACPI _PTS method to be executed in the power-off/reboot code path (as per the specification) in an attempt to improve things on some systems (apparently expecting _PTS to be executed in that code path), but broke power-off/reboot on at least one other machine (Rafael Wysocki). - Fix kernel builds with the new WDAT watchdog driver enabled in some configurations by explicitly selecting WATCHDOG_CORE when enabling the WDAT watchdog driver (Mika Westerberg)" * tag 'acpi-4.9-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: watchdog: wdat_wdt: Select WATCHDOG_CORE Revert "ACPI: Execute _PTS before system reboot"	2016-11-25 15:16:51 -08:00
Rafael J. Wysocki	686564434e	MAINTAINERS: Add bug tracking system location entry type Following the kernel Bugzilla discussion during the Kernel Summit (https://lwn.net/Articles/705245/), add bug tracking system location entry type (B) to MAINTAINERS and populate it for several subsystems known to be using the kernel BZ actively (and add the upstream BZ for ACPICA too). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2016-11-25 15:16:28 -08:00
Jarkko Nikula	89119f0835	Revert "i2c: designware: do not disable adapter after transfer" This reverts commit `0317e6c0f1`. Srinivas reported recently touchscreen and touchpad stopped working in Haswell based machine in Linux 4.9-rc series with timeout errors from i2c_designware: [ 16.508013] i2c_designware INT33C3:00: controller timed out [ 16.508302] i2c_hid i2c-MSFT0001:02: failed to change power setting. [ 17.532016] i2c_designware INT33C3:00: controller timed out [ 18.556022] i2c_designware INT33C3:00: controller timed out [ 18.556315] i2c_hid i2c-ATML1000:00: failed to retrieve report from device. I managed to reproduce similar errors on another Haswell based machine where touchscreen initialization fails maybe in every 1/5 - 1/2 boots. Since root cause for these errors is not clear yet and debugging is ongoing it's better to revert this commit as we are near to release. Reported-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>	2016-11-25 23:23:25 +01:00
David S. Miller	ca89fa77b4	Merge branch 'cgroup-bpf' Daniel Mack says: ==================== Add eBPF hooks for cgroups This is v9 of the patch set to allow eBPF programs for network filtering and accounting to be attached to cgroups, so that they apply to all sockets of all tasks placed in that cgroup. The logic also allows to be extendeded for other cgroup based eBPF logic. Again, only minor details are updated in this version. Changes from v8: * Move the egress hooks into ip_finish_output() and ip6_finish_output() so they run after the netfilter hooks. For IPv4 multicast, add a new ip_mc_finish_output() callback that is invoked on success by netfilter, and call the eBPF program from there. Changes from v7: * Replace the static inline function cgroup_bpf_run_filter() with two specific macros for ingress and egress. This addresses David Miller's concern regarding skb->sk vs. sk in the egress path. Thanks a lot to Daniel Borkmann and Alexei Starovoitov for the suggestions. Changes from v6: * Rebased to 4.9-rc2 * Add EXPORT_SYMBOL(__cgroup_bpf_run_filter). The kbuild test robot now succeeds in building this version of the patch set. * Switch from bpf_prog_run_save_cb() to bpf_prog_run_clear_cb() to not tamper with the contents of skb->cb[]. Pointed out by Daniel Borkmann. * Use sk_to_full_sk() in the egress path, as suggested by Daniel Borkmann. * Renamed BPF_PROG_TYPE_CGROUP_SOCKET to BPF_PROG_TYPE_CGROUP_SKB, as requested by David Ahern. * Added Alexei's Acked-by tags. Changes from v5: * The eBPF programs now operate on L3 rather than on L2 of the packets, and the egress hooks were moved from __dev_queue_xmit() to ip_output(). For BPF_PROG_TYPE_CGROUP_SOCKET, disallow direct access to the skb through BPF_LD_[ABS\|IND] instructions, but hook up the bpf_skb_load_bytes() access helper instead. Thanks to Daniel Borkmann for the help. Changes from v4: * Plug an skb leak when dropping packets due to eBPF verdicts in __dev_queue_xmit(). Spotted by Daniel Borkmann. * Check for sk_fullsock(sk) in __cgroup_bpf_run_filter() so we don't operate on timewait or request sockets. Suggested by Daniel Borkmann. * Add missing @parent parameter in kerneldoc of __cgroup_bpf_update(). Spotted by Rami Rosen. * Include linux/jump_label.h from bpf-cgroup.h to fix a kbuild error. Changes from v3: * Dropped the _FILTER suffix from BPF_PROG_TYPE_CGROUP_SOCKET_FILTER, renamed BPF_ATTACH_TYPE_CGROUP_INET_{E,IN}GRESS to BPF_CGROUP_INET_{IN,E}GRESS and alias BPF_MAX_ATTACH_TYPE to __BPF_MAX_ATTACH_TYPE, as suggested by Daniel Borkmann. * Dropped the attach_flags member from the anonymous struct for BPF attach operations in union bpf_attr. They can be added later on via CHECK_ATTR. Requested by Daniel Borkmann and Alexei. * Release old_prog at the end of __cgroup_bpf_update rather that at the beginning to fix a race gap between program updates and their users. Spotted by Daniel Borkmann. * Plugged an skb leak when dropping packets on the egress path. Spotted by Daniel Borkmann. * Add cgroups@vger.kernel.org to the loop, as suggested by Rami Rosen. * Some minor coding style adoptions not worth mentioning in particular. Changes from v2: * Fixed the RCU locking details Tejun pointed out. * Assert bpf_attr.flags == 0 in BPF_PROG_DETACH syscall handler. Changes from v1: * Moved all bpf specific cgroup code into its own file, and stub out related functions for !CONFIG_CGROUP_BPF as static inline nops. This way, the call sites are not cluttered with #ifdef guards while the feature remains compile-time configurable. * Implemented the new scheme proposed by Tejun. Per cgroup, store one set of pointers that are pinned to the cgroup, and one for the programs that are effective. When a program is attached or detached, the change is propagated to all the cgroup's descendants. If a subcgroup has its own pinned program, skip the whole subbranch in order to allow delegation models. * The hookup for egress packets is now done from __dev_queue_xmit(). * A static key is now used in both the ingress and egress fast paths to keep performance penalties close to zero if the feature is not in use. * Overall cleanup to make the accessors use the program arrays. This should make it much easier to add new program types, which will then automatically follow the pinned vs. effective logic. * Fixed locking issues, as pointed out by Eric Dumazet and Alexei Starovoitov. Changes to the program array are now done with xchg() and are protected by cgroup_mutex. * eBPF programs are now expected to return 1 to let the packet pass, not >= 0. Pointed out by Alexei. * Operation is now limited to INET sockets, so local AF_UNIX sockets are not affected. The enum members are renamed accordingly. In case other socket families should be supported, this can be extended in the future. * The sample program learned to support both ingress and egress, and can now optionally make the eBPF program drop packets by making it return 0. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 16:26:12 -05:00
Daniel Mack	d8c5b17f2b	samples: bpf: add userspace example for attaching eBPF programs to cgroups Add a simple userpace program to demonstrate the new API to attach eBPF programs to cgroups. This is what it does: * Create arraymap in kernel with 4 byte keys and 8 byte values * Load eBPF program The eBPF program accesses the map passed in to store two pieces of information. The number of invocations of the program, which maps to the number of packets received, is stored to key 0. Key 1 is incremented on each iteration by the number of bytes stored in the skb. * Detach any eBPF program previously attached to the cgroup * Attach the new program to the cgroup using BPF_PROG_ATTACH * Once a second, read map[0] and map[1] to see how many bytes and packets were seen on any socket of tasks in the given cgroup. The program takes a cgroup path as 1st argument, and either "ingress" or "egress" as 2nd. Optionally, "drop" can be passed as 3rd argument, which will make the generated eBPF program return 0 instead of 1, so the kernel will drop the packet. libbpf gained two new wrappers for the new syscall commands. Signed-off-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 16:26:04 -05:00
Daniel Mack	33b486793c	net: ipv4, ipv6: run cgroup eBPF egress programs If the cgroup associated with the receiving socket has an eBPF programs installed, run them from ip_output(), ip6_output() and ip_mc_output(). From mentioned functions we have two socket contexts as per `7026b1ddb6` ("netfilter: Pass socket pointer down through okfn()."). We explicitly need to use sk instead of skb->sk here, since otherwise the same program would run multiple times on egress when encap devices are involved, which is not desired in our case. eBPF programs used in this context are expected to either return 1 to let the packet pass, or != 1 to drop them. The programs have access to the skb through bpf_skb_load_bytes(), and the payload starts at the network headers (L3). Note that cgroup_bpf_run_filter() is stubbed out as static inline nop for !CONFIG_CGROUP_BPF, and is otherwise guarded by a static key if the feature is unused. Signed-off-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 16:26:04 -05:00
Daniel Mack	c11cd3a6ec	net: filter: run cgroup eBPF ingress programs If the cgroup associated with the receiving socket has an eBPF programs installed, run them from sk_filter_trim_cap(). eBPF programs used in this context are expected to either return 1 to let the packet pass, or != 1 to drop them. The programs have access to the skb through bpf_skb_load_bytes(), and the payload starts at the network headers (L3). Note that cgroup_bpf_run_filter() is stubbed out as static inline nop for !CONFIG_CGROUP_BPF, and is otherwise guarded by a static key if the feature is unused. Signed-off-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 16:26:04 -05:00
Daniel Mack	f432455148	bpf: add BPF_PROG_ATTACH and BPF_PROG_DETACH commands Extend the bpf(2) syscall by two new commands, BPF_PROG_ATTACH and BPF_PROG_DETACH which allow attaching and detaching eBPF programs to a target. On the API level, the target could be anything that has an fd in userspace, hence the name of the field in union bpf_attr is called 'target_fd'. When called with BPF_ATTACH_TYPE_CGROUP_INET_{E,IN}GRESS, the target is expected to be a valid file descriptor of a cgroup v2 directory which has the bpf controller enabled. These are the only use-cases implemented by this patch at this point, but more can be added. If a program of the given type already exists in the given cgroup, the program is swapped automically, so userspace does not have to drop an existing program first before installing a new one, which would otherwise leave a gap in which no program is attached. For more information on the propagation logic to subcgroups, please refer to the bpf cgroup controller implementation. The API is guarded by CAP_NET_ADMIN. Signed-off-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 16:26:04 -05:00
Daniel Mack	3007098494	cgroup: add support for eBPF programs This patch adds two sets of eBPF program pointers to struct cgroup. One for such that are directly pinned to a cgroup, and one for such that are effective for it. To illustrate the logic behind that, assume the following example cgroup hierarchy. A - B - C \ D - E If only B has a program attached, it will be effective for B, C, D and E. If D then attaches a program itself, that will be effective for both D and E, and the program in B will only affect B and C. Only one program of a given type is effective for a cgroup. Attaching and detaching programs will be done through the bpf(2) syscall. For now, ingress and egress inet socket filtering are the only supported use-cases. Signed-off-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 16:25:52 -05:00
Daniel Mack	0e33661de4	bpf: add new prog type for cgroup socket filtering This program type is similar to BPF_PROG_TYPE_SOCKET_FILTER, except that it does not allow BPF_LD_[ABS\|IND] instructions and hooks up the bpf_skb_load_bytes() helper. Programs of this type will be attached to cgroups for network filtering and accounting. Signed-off-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 16:25:52 -05:00
Rafael J. Wysocki	7e5c07af86	Merge branches 'acpi-sleep-fixes' and 'acpi-wdat-fixes' * acpi-sleep-fixes: Revert "ACPI: Execute _PTS before system reboot" * acpi-wdat-fixes: watchdog: wdat_wdt: Select WATCHDOG_CORE	2016-11-25 22:24:07 +01:00
David S. Miller	fb09c8c524	linux-can-fixes-for-4.9-20161123 -----BEGIN PGP SIGNATURE----- iQFHBAABCgAxFiEES2FAuYbJvAGobdVQPTuqJaypJWoFAlg1ppETHG1rbEBwZW5n dXRyb25peC5kZQAKCRA9O6olrKklaqieB/4o9PaD8Rj38Piy7lJW1ahAxoUnY4AA Vu1eFvtidUswO5RV6mDOuqhTulzrMcPQZguW/S7eLZh6hWYVVLlgkrNLj/RpMXsH rqGRC/sL5ICL1q/ijYK6NJJ3+GFQhl92gG+wJxsQfETWVDKH13N3sWcEyBh0+C5P lnPFNVDVSy4bpkEgXAN/sfAvoHzW//34cnxTzlsd1COAWlxZ+HHgBAGp4kaYTpbF Vz3kuNPfDI7U+36quE8SUXe/R9HfqQBtfbFtaxha8vqH8Fw6MJYO0BUJVmtawTDq nBFvB/x+d0n1YeOgo5UD5bW9thItF57GEscWqYpTuhZ0jlPr5+CZeo14 =FImK -----END PGP SIGNATURE----- Merge tag 'linux-can-fixes-for-4.9-20161123' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2016-11-23 this is a pull request for net/master. The patch by Oliver Hartkopp for the broadcast manager (bcm) fixes the CAN-FD support, which may cause an out-of-bounds access otherwise. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 16:17:12 -05:00
Geliang Tang	f7db0ec957	dwc_eth_qos: drop duplicate headers Drop duplicate headers types.h and delay.h from dwc_eth_qos.c. Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 16:13:59 -05:00
Colin Ian King	619228d86b	cxgb4: fix memory leak on txq_info Currently if txq_info->uldtxq cannot be allocated then txq_info->txq is being kfree'd (which is redundant because it is NULL) instead of txq_info. Fix this by instead kfree'ing txq_info. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 16:09:50 -05:00
Linus Torvalds	f2051f8f9d	Received a copule of last minute fixes for v4.9 -----BEGIN PGP SIGNATURE----- iQIcBAABCAAGBQJYOByJAAoJEFGvii+H/Hdhgc0QAI7ij6fMPYey72ThnOAiFM8u X0KT7pOkMJ7Xz1SGpti+Qp/Ye/Hxu+CgDsfRgRcRoVVTqEcGDngGt0tb+2RtB7Js u/12Smd0+nqW61IP7weOKWDqYB38PcUGTgM7HFx3jc0QplYVrgrRlKR/3nQCIzGh yCgKurAMW2+1Rc2Emi/56Cz3uXwyisezgz9Zaya5eCaZtkpMAs9EbL76f5FkiCPU hR8mlWCURfv786xWy2sL1Pv2cBnuJECEPSPobepv4hkNJwsEbaLEGWQ4pHp76OUz gIeNtNOSbWs6a4MM3KnHNjDafy7J6ZDpEgbykNSQLK9AERmhsiaz0cMLV2mGvYgd 4ls/nUTyXLoZ/7hj2LGkT754b3Dg/miScs5w57fQLuQIF340abmmKJ54oSH9YmkF lunLvUmhn28aPHfSkgx9cruqoRXkcH/JDAlJe9M0sjZy5KXXgwmdX1+IuD9H2fNu 5+PAK0Zegj4Jj/gFhWHEBSgcBY6GQ3aqA/YZq4cFutO20dIwuPheskHmarFdGWDg 12BdRtACa7m8WuAlqrQ6pujvuL/7oVko/rxcR06WgQP2Y6PrKC1UCpy2tBXuJGH5 Iu1yGTRWDojOsYbZTa00cndRmWO5doc8w6JOErtqsxzZA7D9gk0ANeEKnXP/96Vd fKMvT+9misuD5OKquiTE =OXHK -----END PGP SIGNATURE----- Merge tag 'mfd-fixes-4.9.1' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd Pull MFD fixes from Lee Jones: "Received a copule of last minute fixes for v4.9. The patches from Viresh are fixing issues displayed in KernelCI" * tag 'mfd-fixes-4.9.1' of git://git.kernel.org/pub/scm/linux/kernel/git/lee/mfd: mfd: wm8994-core: Don't use managed regulator bulk get API mfd: wm8994-core: Disable regulators before removing them mfd: syscon: Support native-endian regmaps	2016-11-25 11:36:35 -08:00
Linus Torvalds	ea9ea6c6f5	media fixes for v4.9-rc7 -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJYOBAqAAoJEAhfPr2O5OEVJucQAJ9EGPFnupRUCQxqM0OGafSL /MY7cB18TLtRPVjJRNCm0aN9Y9di7iQGW+KxeWVA1S3DYeIi4AR1xaBkR5QUl9Vy +vOujTe37zwahMIyCCZYjZN4lEU6G2HjFebNplnBBTM4lPnYmKck/c+5eq9/W+yS BIb3tGI/wQ6247ku9NjmBzwz3Ekv0NKOp7uluPoiJ0EpRq2nDBaQByYjtzMlh8kh hzhffEs3l1MabaOefDwl6wVZii2LjzRP2sOwdfGS+c52/Bo3zbwD0TwC117weYLV BVymgIiWbxX3PhxjliZdsjxCUGkHvwrjS1kWVjYe6jclagZvwOyuALdRmjvS8g7p ps7AMDLV6GuUhmfo00Mcr2FXInx+DhIjPqFx85U1YckY7YhAAv8mLPbCt0Y8Ewcm 80afrZw4m2nYpM5NlUNl6CzyqOgZsOjwIoYqhi5hBspSYnQ9GOg2VEtHLdYjvnhp 25qbM+ZuO1WM/l1xBf1LT3wMSD0EObaP7dl+GimO+Nc7Qwqnt9WL6JB1xWUdwvWG l6EYmuMwFdq4V5usWVlwTTOUgva98KqpK6qIm85DUhM5CoqMPdOmn9dY/cWu27ck xUOnKP3IRmVkAKK3H+YJvggDWhz0uOI1ylQSCYKKLBwzMeVoWfW7xwIuJVu6m6P+ 8jrHSXNlD8Mu674Uc+2l =oseK -----END PGP SIGNATURE----- Merge tag 'media/v4.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fix from Mauro Carvalho Chehab: "Fix for the firmware load logic of the tuner-xc2028 driver" * tag 'media/v4.9-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: xc2028: Fix use-after-free bug properly	2016-11-25 11:31:01 -08:00
Linus Torvalds	6006d6e719	amd, mediatek, exynos and hdlcd fixes. -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJYN7+wAAoJEAx081l5xIa++uYQAKPoS2yg+EQytgBmAVJxQ0Vi Z8O+z6rPOubHvXHqdAswNbZaxYxiL1RGyn1nYYNpLMTNOF2md8KrYvxc7irikcnC qxTCtZCkWRsUjzprxea+koaq3m6wBT4ToK+jl+mxVsSo/vCtYVWu/zgPnW9Drqiy Z9XfrwESDwMYK/tVCorucBNmzHvleJ+AngpsAyE6KZHy2eOyDiunIDMnAyPi9S2C KdC6jaHTv3p7Z/xyEw7gGvyqPFrX50kmphLY8O3+KBJfosifMuXLJV/p41iX0aI8 8P3lwQpb3h9Fv9urCXjXU+i8F3L7A6y4RuKCvZtKNGpas6jRrOGojkQJhD17EdvT N7D5ijIskCeoMFwaJ2kUMupY06rzt1CaNJAXpGbryR1bLUvyJysMWNDKHgPC5jUS MZIhqArUqht1DzFup25fx54CAAbb+4zOBiX9Jgq5zqqiq8i2XAWRoMT+fvJ6H5BL T819KKneKpS/uK61m8xPhopdOB1wU0rNvsOVYjXMxT4xTrbXODllO+Mm1eAuAPwh qfyCdhwspoDlaXwDFtkG9EyJpSeq6+A8DZS8DYz9wdcAhbHuy4kuMIYCeXCEfhsf xclxRVrzG3cWsu6QeEXdVjfVyN9fuhyzOJxkVzFVyfesiY0OHYoOY1AAouuO36q+ X2avUsWc8qtG6ap/rgWJ =N8S/ -----END PGP SIGNATURE----- Merge tag 'drm-fixes-for-v4.9-rc7' of git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "Seems to be quietening down nicely, a few mediatek, one exynos and one hdlcd fix, along with two amd fixes" * tag 'drm-fixes-for-v4.9-rc7' of git://people.freedesktop.org/~airlied/linux: gpu/drm/exynos/exynos_hdmi - Unmap region obtained by of_iomap drm/mediatek: fix null pointer dereference drm/mediatek: fixed the calc method of data rate per lane drm/mediatek: fix a typo of DISP_OD_CFG to OD_RELAYMODE drm/radeon: fix power state when port pm is unavailable (v2) drm/amdgpu: fix power state when port pm is unavailable drm/arm: hdlcd: fix plane base address update drm/amd/powerplay: avoid out of bounds access on array ps.	2016-11-25 10:51:35 -08:00

... 2 3 4 5 6 ...

636430 Commits