linux_dsm_epyc7002

mirror of https://github.com/AuxXxilium/linux_dsm_epyc7002.git synced 2024-12-28 11:18:45 +07:00

Author	SHA1	Message	Date
Jesse Brandeburg	0bcd952fee	ethernet/intel: consolidate NAPI and NAPI exit While reviewing code, I noticed that Eric Dumazet recommends that drivers check the return code of napi_complete_done, and use that to decide to enable interrupts or not when exiting poll. One of the Intel drivers was already fixed (ixgbe). Upon looking at the Intel drivers as a whole, we are handling our polling and NAPI exit in a few different ways based on whether we have multiqueue and whether we have Tx cleanup included. Several drivers had the bug of exiting NAPI with return 0, which appears to mess up the accounting in the stack. Consolidate all the NAPI routines to do best known way of exiting and to just mostly look like each other. 1) check return code of napi_complete_done to control interrupt enable 2) return the actual amount of work done. 3) return budget immediately if need NAPI poll again Tested the changes on e1000e with a high interrupt rate set, and it shows about an 8% reduction in the CPU utilization when busy polling because we aren't re-enabling interrupts when we're about to be polled. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-21 10:35:23 -08:00
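The three NAPI exit rules listed in the commit above can be sketched as a small model. This is not driver code: `FakeNapi`, `complete_done`, `poll` and `irq_log` are hypothetical names that only stand in for the NAPI context, `napi_complete_done()`, a driver poll routine and the interrupt re-enable step.

```python
class FakeNapi:
    """Hypothetical stand-in for a NAPI context (not a kernel API)."""

    def __init__(self, rearm_allowed=True):
        self.rearm_allowed = rearm_allowed
        self.completed = False

    def complete_done(self, work_done):
        # Models the return code of napi_complete_done(): False means the
        # core is about to poll again, so interrupts should stay disabled.
        self.completed = True
        return self.rearm_allowed


def poll(napi, pending, budget, irq_log):
    """Model of a poll routine that follows the three rules."""
    work_done = min(pending, budget)
    if work_done == budget:
        # Rule 3: the budget was used up, return it immediately and stay
        # in polling mode without completing NAPI.
        return budget
    # Rule 1: the completion return code decides whether to re-enable IRQs.
    if napi.complete_done(work_done):
        irq_log.append("irq_enabled")
    # Rule 2: report the actual amount of work done.
    return work_done
```

With `rearm_allowed=False` the model returns the work done but leaves interrupts off, which is the case the commit credits for the lower CPU utilization under busy polling.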
Jesse Brandeburg	09e58b2d53	docs-networking: fix typo in define The #define for NETIF_F_GSO_UDP_L4 was incorrect in the documentation, fix it by making it match the actual code. Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-21 10:30:30 -08:00
Joe Perches	4df3c543a7	igb: Fix format with line continuation whitespace The line continuation unintentionally adds whitespace so instead use a coalesced format to remove the whitespace. Miscellanea: o Use a more typical style for ternaries and arguments for this logging message Signed-off-by: Joe Perches <joe@perches.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-21 10:22:10 -08:00
Dave Chinner	8c110d43c6	iomap: readpages doesn't zero page tail beyond EOF When we read the EOF page of the file via readpages, we need to zero the region beyond EOF that we either do not read or should not contain data so that mmap does not expose stale data to user applications. However, iomap_adjust_read_range() fails to detect EOF correctly, and so fsx on 1k block size filesystems fails very quickly with mapreads exposing data beyond EOF. There are two problems here. Firstly, when calculating the end block of the EOF byte, we have to reduce the size by one to prevent a block aligned EOF from reporting a block too large. i.e. a size of 1024 bytes is 1 block, which in index terms is block 0. Therefore we have to calculate the end block from (isize - 1), not isize. The second bug is determining if the current page spans EOF, and so whether we need to split it into two halves, one for the IO, and the other for zeroing. Unfortunately, the code that checks whether we should split the block doesn't actually check if we span EOF, it just checks if the read spans the /offset in the page/ that EOF sits on. So it splits every read into two if EOF is not page aligned, regardless of whether we are reading the EOF block or not. Hence we need to restrict the "does the read span EOF" check to just the page that spans EOF, not every page we read.
This patch results in correct EOF detection through readpages: xfs_vm_readpages: dev 259:0 ino 0x43 nr_pages 24 xfs_iomap_found: dev 259:0 ino 0x43 size 0x66c00 offset 0x4f000 count 98304 type hole startoff 0x13c startblock 1368 blockcount 0x4 iomap_readpage_actor: orig pos 323584 pos 323584, length 4096, poff 0 plen 4096, isize 420864 xfs_iomap_found: dev 259:0 ino 0x43 size 0x66c00 offset 0x50000 count 94208 type hole startoff 0x140 startblock 1497 blockcount 0x5c iomap_readpage_actor: orig pos 327680 pos 327680, length 94208, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 331776 pos 331776, length 90112, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 335872 pos 335872, length 86016, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 339968 pos 339968, length 81920, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 344064 pos 344064, length 77824, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 348160 pos 348160, length 73728, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 352256 pos 352256, length 69632, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 356352 pos 356352, length 65536, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 360448 pos 360448, length 61440, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 364544 pos 364544, length 57344, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 368640 pos 368640, length 53248, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 372736 pos 372736, length 49152, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 376832 pos 376832, length 45056, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 380928 pos 380928, length 40960, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 385024 pos 385024, length 36864, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 389120 pos 389120, length 32768, poff 0 plen 4096, isize 420864 
iomap_readpage_actor: orig pos 393216 pos 393216, length 28672, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 397312 pos 397312, length 24576, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 401408 pos 401408, length 20480, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 405504 pos 405504, length 16384, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 409600 pos 409600, length 12288, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 413696 pos 413696, length 8192, poff 0 plen 4096, isize 420864 iomap_readpage_actor: orig pos 417792 pos 417792, length 4096, poff 0 plen 3072, isize 420864 iomap_readpage_actor: orig pos 420864 pos 420864, length 1024, poff 3072 plen 1024, isize 420864 As you can see, it now does full page reads until the last one which is split correctly at the block aligned EOF, reading 3072 bytes and zeroing the last 1024 bytes. The original version of the patch got this right, but it got another case wrong. The EOF crossing detection really needs to use the original length, because plen, while it starts at the end of the block, will be shortened as up-to-date blocks are found on the page. This means "orig_pos + plen" no longer points to the end of the page, and so will not correctly detect EOF crossing.
Hence we have to use the length passed in to detect this partial page case: xfs_filemap_fault: dev 259:1 ino 0x43 write_fault 0 xfs_vm_readpage: dev 259:1 ino 0x43 nr_pages 1 xfs_iomap_found: dev 259:1 ino 0x43 size 0x2cc00 offset 0x2c000 count 4096 type hole startoff 0xb0 startblock 282 blockcount 0x4 iomap_readpage_actor: orig pos 180224 pos 181248, length 4096, poff 1024 plen 2048, isize 183296 xfs_iomap_found: dev 259:1 ino 0x43 size 0x2cc00 offset 0x2cc00 count 1024 type hole startoff 0xb3 startblock 285 blockcount 0x1 iomap_readpage_actor: orig pos 183296 pos 183296, length 1024, poff 3072 plen 1024, isize 183296 Here we see a trace where the first block on the EOF page is up to date, hence poff = 1024 bytes. The offset into the page of EOF is 3072, so the range we want to read is 1024 - 3071, and the range we want to zero is 3072 - 4095. You can see this is split correctly now. This fixes the stale data beyond EOF problem that fsx quickly uncovers on 1k block size filesystems. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-11-21 10:10:54 -08:00
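The arithmetic in the readpages commit above can be checked with a short model. It is only a sketch of the numbers quoted in the message and traces, not the iomap code; `eof_block` and `split_eof_page` are hypothetical helper names, and the split helper assumes EOF falls inside the page after `poff`.

```python
def eof_block(isize, blkbits):
    """Index of the block that holds the last byte of the file.

    Using (isize - 1) keeps a block aligned size such as 1024 bytes
    (one 1k block) at index 0 instead of 1.
    """
    return (isize - 1) >> blkbits


def split_eof_page(page_pos, poff, isize, page_size=4096):
    """Return (read_range, zero_range) as inclusive in-page byte offsets.

    Assumes the page starting at page_pos spans EOF and poff is below EOF.
    """
    eof_off = isize - page_pos          # offset of EOF inside this page
    read_range = (poff, eof_off - 1)    # bytes to fetch from disk
    zero_range = (eof_off, page_size - 1)  # bytes beyond EOF to zero
    return read_range, zero_range
```

For the second trace (page at 180224, `poff` 1024, `isize` 183296) the model gives a read of 1024 to 3071 and a zeroed range of 3072 to 4095, the split the message describes.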
Dave Chinner	494633fac7	vfs: vfs_dedupe_file_range() doesn't return EOPNOTSUPP It returns EINVAL when the operation is not supported by the filesystem. Fix it to return EOPNOTSUPP to be consistent with the man page and clone_file_range(). Clean up the inconsistent error return handling while I'm there. (I know, lipstick on a pig, but every little bit helps...) Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-11-21 10:10:54 -08:00
Dave Chinner	4721a60109	iomap: dio data corruption and spurious errors when pipes fill When doing direct IO to a pipe for do_splice_direct(), the pipe is trivial to fill up and overflow as it can only hold 16 pages. At this point bio_iov_iter_get_pages() then returns -EFAULT, and we abort the IO submission process. Unfortunately, iomap_dio_rw() propagates the error back up the stack. The error is converted from the EFAULT to EAGAIN in generic_file_splice_read() to tell the splice layers that the pipe is full. do_splice_direct() completely fails to handle EAGAIN errors (it aborts on error) and returns EAGAIN to the caller. copy_file_write() then completely fails to handle EAGAIN as well, and so returns EAGAIN to userspace, having failed to copy the data it was asked to. Avoid this whole steaming pile of fail by having iomap_dio_rw() silently swallow EFAULT errors and so do short reads. To make matters worse, iomap_dio_actor() has a stale data exposure bug when bio_iov_iter_get_pages() fails - it does not zero the tail block that may have been left uncovered by partial IO. Fix the error handling case to drop to the sub-block zeroing rather than immediately returning the -EFAULT error. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-11-21 10:10:53 -08:00
Dave Chinner	b450672fb6	iomap: sub-block dio needs to zeroout beyond EOF If we are doing sub-block dio that extends EOF, we need to zero the unused tail of the block to initialise the data in it. If we do not zero the tail of the block, then an immediate mmap read of the EOF block will expose stale data beyond EOF to userspace. Found with fsx running sub-block DIO sizes vs MAPREAD/MAPWRITE operations. Fix this by detecting if the end of the DIO write is beyond EOF and zeroing the tail if necessary. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-11-21 10:10:53 -08:00
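A rough model of the tail zeroing decision in the sub-block dio commit above: given where a direct IO write ends, work out which bytes of the final block would need zeroing. `tail_zero_range` is a hypothetical helper, and treating a write that ends exactly at the old EOF as needing no zeroing is an assumption of this sketch, not something the commit message states.

```python
def tail_zero_range(pos, length, isize, block_size):
    """Return the half-open byte range [start, end) to zero, or None.

    pos/length describe the DIO write, isize is the file size before it.
    """
    end = pos + length
    if end <= isize:
        # Assumption: the write does not end beyond EOF, nothing to zero.
        return None
    if end % block_size == 0:
        # The write ends on a block boundary, so there is no unused tail.
        return None
    # Round the end of the write up to the next block boundary.
    block_end = (end // block_size + 1) * block_size
    return (end, block_end)
```

For example, a 512 byte write at offset 0 into an empty file on a 1k block filesystem gives `(512, 1024)`, the unused half of the EOF block that an mmap read could otherwise expose.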
Dave Chinner	0929d85800	iomap: FUA is wrong for DIO O_DSYNC writes into unwritten extents When we write into an unwritten extent via direct IO, we dirty metadata on IO completion to convert the unwritten extent to written. However, when we do the FUA optimisation checks, the inode may be clean and so we issue a FUA write into the unwritten extent. This means we then bypass the generic_write_sync() call after unwritten extent conversion has been done and we don't force the modified metadata to stable storage. This violates O_DSYNC semantics. The window of exposure is a single IO, as the next DIO write will see the inode has dirty metadata and hence will not use the FUA optimisation. Calling generic_write_sync() after completion of the second IO will also sync the first write and its metadata. Fix this by avoiding the FUA optimisation when writing to unwritten extents. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-11-21 10:10:53 -08:00
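The FUA rule from the commit above reduces to a predicate. This is a simplified sketch with hypothetical parameter names; the real checks in the iomap code involve more state than the four flags modelled here.

```python
def can_use_fua(is_dsync, device_has_fua, inode_metadata_dirty,
                extent_unwritten):
    """Decide whether an O_DSYNC direct write may skip generic_write_sync().

    A FUA write only makes the data stable. It is therefore only safe
    when no metadata update has to reach stable storage as well.
    """
    if not (is_dsync and device_has_fua):
        return False
    if inode_metadata_dirty:
        return False
    # The fix: writing into an unwritten extent dirties metadata at IO
    # completion (the extent conversion), so FUA alone is not enough.
    if extent_unwritten:
        return False
    return True
```

Before the fix, the equivalent of the `extent_unwritten` check was missing, so a clean inode writing into an unwritten extent took the FUA path.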
Dave Chinner	9230a0b65b	xfs: delalloc -> unwritten COW fork allocation can go wrong Long saga. There have been days spent following this through dead end after dead end in multi-GB event traces. This morning, after writing a trace-cmd wrapper that enabled me to be more selective about XFS trace points, I discovered that I could get just enough essential tracepoints enabled that there was a 50:50 chance the fsx config would fail at ~115k ops. If it didn't fail at op 115547, I stopped fsx at op 115548 anyway. That gave me two traces - one where the problem manifested, and one where it didn't. After refining the traces to have the necessary information, I found that in the failing case there was a real extent in the COW fork compared to an unwritten extent in the working case. Walking back through the two traces to the point where the COW fork extents actually diverged, I found that the bad case had an extra unwritten extent in it. This is likely because the bug it led me to had triggered multiple times in those 115k ops, leaving stray COW extents around.
What I saw was a COW delalloc conversion to an unwritten extent (as they should always be through xfs_iomap_write_allocate()) resulted in a /written extent/: xfs_writepage: dev 259:0 ino 0x83 pgoff 0x17000 size 0x79a00 offset 0 length 0 xfs_iext_remove: dev 259:0 ino 0x83 state RC\|LF\|RF\|COW cur 0xffff888247b899c0/2 offset 32 block 152 count 20 flag 1 caller xfs_bmap_add_extent_delay_real xfs_bmap_pre_update: dev 259:0 ino 0x83 state RC\|LF\|RF\|COW cur 0xffff888247b899c0/1 offset 1 block 4503599627239429 count 31 flag 0 caller xfs_bmap_add_extent_delay_real xfs_bmap_post_update: dev 259:0 ino 0x83 state RC\|LF\|RF\|COW cur 0xffff888247b899c0/1 offset 1 block 121 count 51 flag 0 caller xfs_bmap_add_ex Basically, Cow fork before: 0 1 32 52 +H+DDDDDDDDDDDD+UUUUUUUUUUU+ PREV RIGHT COW delalloc conversion allocates: 1 32 +uuuuuuuuuuuu+ NEW And the result according to the xfs_bmap_post_update trace was: 0 1 32 52 +H+wwwwwwwwwwwwwwwwwwwwwwww+ PREV Which is clearly wrong - it should be a merged unwritten extent, not a written extent. That led me to look at the LEFT_FILLING\|RIGHT_FILLING\|RIGHT_CONTIG case in xfs_bmap_add_extent_delay_real(), and sure enough, there's the bug. It takes the old delalloc extent (PREV) and adds the length of the RIGHT extent to it, takes the start block from NEW, removes the RIGHT extent and then updates PREV with the new extent. What it fails to do is update PREV.br_state. For delalloc, this is always XFS_EXT_NORM, while in this case we are converting the delayed allocation to unwritten, so it needs to be updated to XFS_EXT_UNWRITTEN. This LF\|RF\|RC case does not do this, and so the resultant extent is always written. And that's the bug I've been chasing for a week - a bmap btree bug, not a reflink/dedupe/copy_file_range bug, but a BMBT bug introduced with the recent in core extent tree scalability enhancements. Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J.
Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-11-21 10:10:53 -08:00
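The merge described in the COW fork commit above can be modelled with the extent values from the trace (PREV at offset 1 with 31 blocks, RIGHT at offset 32 with 20 blocks, NEW starting at block 121). The `Extent` class and `merge_lf_rf_rc` are hypothetical and only mirror the steps the message lists; the `fix` flag toggles the missing state update.

```python
from dataclasses import dataclass, replace
from typing import Optional

NORM, UNWRITTEN = "written", "unwritten"


@dataclass(frozen=True)
class Extent:
    startoff: int
    startblock: Optional[int]  # None models a delalloc extent
    blockcount: int
    state: str


def merge_lf_rf_rc(prev, new, right, fix=True):
    """Model of the LEFT_FILLING|RIGHT_FILLING|RIGHT_CONTIG merge."""
    # PREV absorbs RIGHT's length and takes its start block from NEW.
    merged = replace(prev,
                     startblock=new.startblock,
                     blockcount=prev.blockcount + right.blockcount)
    if fix:
        # The missing step: carry over NEW's state instead of keeping the
        # delalloc default (written).
        merged = replace(merged, state=new.state)
    return merged


prev = Extent(startoff=1, startblock=None, blockcount=31, state=NORM)
right = Extent(startoff=32, startblock=152, blockcount=20, state=UNWRITTEN)
new = Extent(startoff=1, startblock=121, blockcount=31, state=UNWRITTEN)
```

With `fix=False` the model yields offset 1, block 121, count 51 in the written state, matching the xfs_bmap_post_update trace; with `fix=True` the merged extent stays unwritten.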
Dave Chinner	2c307174ab	xfs: flush removing page cache in xfs_reflink_remap_prep On a sub-page block size filesystem, fsx is failing with a data corruption after a series of operations involving copying a file with the destination offset beyond EOF of the destination of the file: 8093(157 mod 256): TRUNCATE DOWN from 0x7a120 to 0x50000 ******WWWW 8094(158 mod 256): INSERT 0x25000 thru 0x25fff (0x1000 bytes) 8095(159 mod 256): COPY 0x18000 thru 0x1afff (0x3000 bytes) to 0x2f400 8096(160 mod 256): WRITE 0x5da00 thru 0x651ff (0x7800 bytes) HOLE 8097(161 mod 256): COPY 0x2000 thru 0x5fff (0x4000 bytes) to 0x6fc00 The second copy here is beyond EOF, and it is to sub-page (4k) but block aligned (1k) offset. The clone runs the EOF zeroing, landing in a pre-existing post-eof delalloc extent. This zeroes the post-eof extents in the page cache just fine, dirtying the pages correctly. The problem is that xfs_reflink_remap_prep() now truncates the page cache over the range that it is copying it to, and rounds that down to cover the entire start page. This removes the dirty page over the delalloc extent from the page cache without having written it back. Hence later, when the page cache is flushed, the page at offset 0x6f000 has not been written back and hence exposes stale data, which fsx trips over less than 10 operations later. Fix this by changing xfs_reflink_remap_prep() to use xfs_flush_unmap_range(). Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>	2018-11-21 10:10:53 -08:00
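The rounding problem in the reflink commit above is easy to see with the offsets from the fsx log. A small sketch, assuming 4k pages as stated in the message; `page_start` and `dropped_dirty_range` are hypothetical helpers, not XFS functions.

```python
PAGE_SIZE = 4096


def page_start(pos):
    """Round a byte offset down to the start of its page."""
    return pos & ~(PAGE_SIZE - 1)


def dropped_dirty_range(dest_pos):
    """Half-open range [start, end) in front of the copy destination that a
    page-rounded truncate removes from the page cache."""
    return (page_start(dest_pos), dest_pos)
```

For the second COPY, whose destination is 0x6fc00, the truncate is rounded down to 0x6f000, so the 0xc00 bytes of freshly zeroed, dirty page cache in front of the destination are discarded unless they are written back first.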
Shannon Nelson	b3c4d7c93e	ixgbe: add ipsec hw offload note to ixgbe Documentation Add a short note about using IPsec Hardware Offload with the ixgbe driver. Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-21 09:39:38 -08:00
Jens Axboe	14b04063cc	Merge branch 'nvme-4.20' of git://git.infradead.org/nvme into for-linus Pull NVMe fix from Christoph. * 'nvme-4.20' of git://git.infradead.org/nvme: nvme-fc: resolve io failures during connect	2018-11-21 05:56:28 -07:00
Rafael J. Wysocki	0db699f747	linux-cpupower-4.20-rc4 This cpupower update for Linux 4.20-rc4 consists of compile fixes to allow use of outside build flags and override of CFLAGS from Jiri Olsa, and fix to compilation with STATIC=true from Konstantin Khlebnikov. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEPZKym/RZuOCGeA/kCwJExA0NQxwFAlv0cXUACgkQCwJExA0N Qxwjxg/8C9QnM5XrjEqlNz4ESwZ17RWEEZaGcteaUG3yFYtHMYnqzjzDJxW8yjMj XKI/v1IcMNxODoW61mKJ8QPCT4wOfL4Wi7uBHk48SzNCRXCNevFxOENXfDRCHYyH 0dKakVLRclR79n/TbH64WNRVUg4rkaZrpQGxNenDA2LJA4UW/ReUU2Dd8Qbyc6+p 1NA2wKONpLjoxSeyVjxu4Zz8mSucxLrTEzE7kElmqN0ZB6G5HE2yijCwBoTi9p95 mWqxRLDWKxnTZ5MDlS661RBw3tshTa2rtkv2QwijI23Pned5Z9imXDfZ8aadvntJ 8YJdScmhN53yJjRLf8idOchN24qI5RgHHyjvDjNJBrG85oFYuv2bDJSqjZUOX62V oXhRp9bjcw9Frhe70/+yKN/EfXKEaKqXlpMiuBraPh8e+UQwxZU/iOHbGogQyo5Z ot7fRJqiU8Cx7HplGwf3LO6sEXO7eOImBTVtYB3y4ctI5++ce/JACSDj7+RRE1/u VW2x40uzE0lwVDNxJslhNKVPqu/kG6HhVXURxsWpqiw22OI/DQXrt17nbTcFPt1j SDbxx2T+FgNPOjN69YwFzL7HjAqf6fdVwTQbfOiTA3ODA/i7nY9x/Y9FKqhctPj/ 0afbO0s2plqQ0dgYhJdu6tDEqdHycvXSsW6zcBqw0mNIyw+QkOI= =eU6f -----END PGP SIGNATURE----- Merge tag 'linux-cpupower-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux Pull cpupower utility updates for 4.20-rc4 from Shuah Khan: "This cpupower update for Linux 4.20-rc4 consists of compile fixes to allow use of outside build flags and override of CFLAGS from Jiri Olsa, and fix to compilation with STATIC=true from Konstantin Khlebnikov." * tag 'linux-cpupower-4.20-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux: tools cpupower: Override CFLAGS assignments tools cpupower debug: Allow to use outside build flags tools/power/cpupower: fix compilation with STATIC=true	2018-11-21 13:33:06 +01:00
Ville Syrjälä	f559156c39	drm/i915: Add rotation readout for plane initial config If we need to force a full plane update before userspace/fbdev have given us a proper plane state we should try to maintain the current plane state as much as possible (apart from the parts of the state we're trying to fix up with the plane update). To that end add basic readout for the plane rotation and maintain it during the initial fb takeover. Cc: Hans de Goede <hdegoede@redhat.com> Fixes: `516a49cc19` ("drm/i915: Fix assert_plane() warning on bootup with external display") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181120135450.3634-2-ville.syrjala@linux.intel.com Tested-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> (cherry picked from commit `f43348a3db`) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2018-11-21 14:30:58 +02:00
Ville Syrjälä	c773058dde	drm/i915: Force a LUT update in intel_initial_commit() If we force a plane update to fix up our half populated plane state we'll also force on the pipe gamma for the plane (since we always enable pipe gamma currently). If the BIOS hasn't programmed a sensible LUT into the hardware this will cause the image to become corrupted. Typical symptoms are a purple/yellow/etc. flash when the driver loads. To avoid this let's program something sensible into the LUT when we do the plane update. In the future I plan to add proper plane gamma enable readout so this is just a temporary measure. Cc: Hans de Goede <hdegoede@redhat.com> Fixes: `516a49cc19` ("drm/i915: Fix assert_plane() warning on bootup with external display") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181120135450.3634-1-ville.syrjala@linux.intel.com Tested-by: Hans de Goede <hdegoede@redhat.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `fa6af5145b`) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2018-11-21 14:30:54 +02:00
Hans de Goede	2bbb5fa374	ACPI / platform: Add SMB0001 HID to forbidden_id_list Many HP AMD based laptops contain an SMB0001 device like this: Device (SMBD) { Name (_HID, "SMB0001") // _HID: Hardware ID Name (_CRS, ResourceTemplate () // _CRS: Current Resource Settings { IO (Decode16, 0x0B20, // Range Minimum 0x0B20, // Range Maximum 0x20, // Alignment 0x20, // Length ) IRQ (Level, ActiveLow, Shared, ) {7} }) } The legacy style IRQ resource here causes acpi_dev_get_irqresource() to be called with legacy=true and this message to show in dmesg: ACPI: IRQ 7 override to edge, high This causes issues when later on the AMDI0030 GPIO device gets enumerated: Device (GPIO) { Name (_HID, "AMDI0030") // _HID: Hardware ID Name (_CID, "AMDI0030") // _CID: Compatible ID Name (_UID, Zero) // _UID: Unique ID Method (_CRS, 0, NotSerialized) // _CRS: Current Resource Settings { Name (RBUF, ResourceTemplate () { Interrupt (ResourceConsumer, Level, ActiveLow, Shared, ,, ) { 0x00000007, } Memory32Fixed (ReadWrite, 0xFED81500, // Address Base 0x00000400, // Address Length ) }) Return (RBUF) /* \_SB_.GPIO._CRS.RBUF */ } } Now acpi_dev_get_irqresource() gets called with legacy=false, but because of the earlier override of the trigger-type acpi_register_gsi() returns -EBUSY (because we try to register the same interrupt with a different trigger-type) and we end up setting IORESOURCE_DISABLED in the flags. The setting of IORESOURCE_DISABLED causes platform_get_irq() to call acpi_irq_get() which is not implemented on x86 and returns -EINVAL, resulting in the following in dmesg: amd_gpio AMDI0030:00: Failed to get gpio IRQ: -22 amd_gpio: probe of AMDI0030:00 failed with error -22 The SMB0001 is a "virtual" device in the sense that the only way the OS interacts with it is through calling a couple of methods to do SMBus transfers. As such it is weird that it has IO and IRQ resources at all, because the driver for it is not expected to ever access the hardware directly.
The Linux driver for the SMB0001 device directly binds to the acpi_device through the acpi_bus, so we do not need to instantiate a platform_device for this ACPI device. This commit adds the SMB0001 HID to the forbidden_id_list, avoiding the instantiating of a platform_device for it. Not instantiating a platform_device means we will no longer call acpi_dev_get_irqresource() for the legacy IRQ resource fixing the probe of the AMDI0030 device failing. BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1644013 BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=198715 BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=199523 Reported-by: Lukas Kahnert <openproggerfreak@gmail.com> Tested-by: Marc <suaefar@googlemail.com> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	2018-11-21 13:30:13 +01:00
Paul Kocialkowski	8fd3b90300	drm/fb-helper: Blacklist writeback when adding connectors to fbdev Writeback connectors do not produce any on-screen output and require special care for use. Such connectors are hidden from enumeration in DRM resources by default, but they are still picked-up by fbdev. This makes rather little sense since fbdev is not really adapted for dealing with writeback. Moreover, this is also a source of issues when userspace disables the CRTC (and associated plane) without detaching the CRTC from the connector (which is hidden by default). In this case, the connector is still using the CRTC, leading to an "enabled/connectors mismatch" and eventually the failure of the associated atomic commit. This situation happens with VC4 testing under IGT GPU Tools. Filter out writeback connectors in the fbdev helper to solve this. Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com> Reviewed-by: Boris Brezillon <boris.brezillon@bootlin.com> Reviewed-by: Maxime Ripard <maxime.ripard@bootlin.com> Tested-by: Maxime Ripard <maxime.ripard@bootlin.com> Fixes: `935774cd71` ("drm: Add writeback connector type") Cc: <stable@vger.kernel.org> # v4.19+ Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20181115163248.21168-1-paul.kocialkowski@bootlin.com	2018-11-21 10:38:19 +01:00
Chris Wilson	f8577fb3c2	drm/i915: Write GPU relocs harder with gen3 Under moderate amounts of GPU stress, we can observe on Bearlake and Pineview (later gen3 models) that we execute the following batch buffer before the write into the batch is coherent. Adding extra (tested with up to 32x) MI_FLUSH to either the invalidation, flush or both phases does not solve the incoherency issue with the relocations, but emitting the MI_STORE_DWORD_IMM twice does. So be it. Fixes: `7dd4f6729f` ("drm/i915: Async GPU relocation processing") Testcase: igt/gem_tiled_fence_blits # blb/pnv Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20181119154153.15327-1-chris@chris-wilson.co.uk (cherry picked from commit `7fa28e1469`) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2018-11-21 09:32:08 +02:00
David S. Miller	11c6c0c228	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue Jeff Kirsher says: ==================== 100GbE Intel Wired LAN Driver Updates 2018-11-20 This series contains updates to the ice driver only. Akeem updates the driver to determine whether or not to do auto-negotiation based on the VSI state. Bruce cleans up the control queue code to remove duplicate code. Take advantage of some compiler optimizations by making some structures constant, and also note that they cannot be modified. Cleaned up formatting issues and code comment that needed clarification. Fixed a potential NULL pointer dereference by adding a check. Jaroslaw adds a check to verify if memory was allocated or not. Yashaswini Raghuram fixes the driver to ensure we are not enabling the LAN_EN flag if the MAC in the MAC-VLAN is a unicast MAC, so that the unicast packets are not forwarded to the wire. Dave fixes the return value of ice_napi_poll() to be more useful in returning the work that was done and should only return 0 when no work was done. Anirudh does code comment cleanup, to make more consistent. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-20 20:59:27 -08:00
David S. Miller	51428fd661	Merge branch 'dsa-microchip-Modify-KSZ9477-DSA-driver-in-preparation-to-add-other-KSZ-switch-drivers' Tristram Ha says: ==================== net: dsa: microchip: Modify KSZ9477 DSA driver in preparation to add other KSZ switch drivers This series of patches is to modify the original KSZ9477 DSA driver so that other KSZ switch drivers can be added and use the common code. There are several steps to accomplish this achievement. First is to rename some function names with a prefix to indicate chip specific function. Second is to move common code into header that can be shared. Last is to modify tag_ksz.c so that it can handle many tail tag formats used by different KSZ switch drivers. ksz_common.c will contain the common code used by all KSZ switch drivers. ksz9477.c will contain KSZ9477 code from the original ksz_common.c. ksz9477_spi.c is renamed from ksz_spi.c. ksz9477_reg.h is renamed from ksz_9477_reg.h. ksz_common.h is added to provide common code access to KSZ switch drivers. ksz_spi.h is added to provide common SPI access functions to KSZ SPI drivers. v4 - Patches were removed to concentrate on changing driver structure without adding new code. v3 - The phy_device structure is used to hold port link information - A structure is passed in ksz_xmit and ksz_rcv instead of function pointer - Switch offload forwarding is supported v2 - Initialize reg_mutex before use - The alu_mutex is only used inside chip specific functions v1 - Each patch in the set is self-contained - Use ksz9477 prefix to indicate KSZ9477 specific code ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-20 20:57:12 -08:00
Tristram Ha	84bd190819	net: dsa: microchip: rename ksz_9477_reg.h to ksz9477_reg.h Rename ksz_9477_reg.h to ksz9477_reg.h for consistency as the product name is always KSZ####. Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com> Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-20 20:57:12 -08:00
Tristram Ha	c2e866911e	net: dsa: microchip: break KSZ9477 DSA driver into two files Break KSZ9477 DSA driver into two files in preparation to add more KSZ switch drivers. Add common functions in ksz_common.h so that other KSZ switch drivers can access code in ksz_common.c. Add ksz_spi.h for common functions used by KSZ switch SPI drivers. Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com> Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com> Reviewed-by: Pavel Machek <pavel@ucw.cz> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-20 20:57:12 -08:00
Tristram Ha	74a7194f15	net: dsa: microchip: rename ksz_spi.c to ksz9477_spi.c Rename ksz_spi.c to ksz9477_spi.c and update Kconfig in preparation to add more KSZ switch drivers. Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com> Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com> Reviewed-by: Pavel Machek <pavel@ucw.cz> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-20 20:57:12 -08:00
Tristram Ha	353592781d	net: dsa: microchip: rename some functions with ksz9477 prefix Rename some functions with ksz9477 prefix to separate chip specific code from common code. Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com> Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com> Reviewed-by: Pavel Machek <pavel@ucw.cz> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-20 20:57:12 -08:00
Tristram Ha	9bc981c355	net: dsa: microchip: clean up code Clean up code according to patch check suggestions. Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com> Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com> Reviewed-by: Pavel Machek <pavel@ucw.cz> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-20 20:57:12 -08:00
Tristram Ha	5b79c72e96	net: dsa: microchip: replace license with GPL Replace license with GPL. Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com> Reviewed-by: Woojung Huh <Woojung.Huh@microchip.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Acked-by: Pavel Machek <pavel@ucw.cz> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-20 20:57:11 -08:00
Yonghong Song	f6161a8f30	bpf: fix a compilation error when CONFIG_BPF_SYSCALL is not defined Kernel test robot (lkp@intel.com) reports a compilation error at https://www.spinics.net/lists/netdev/msg534913.html introduced by commit `838e96904f` ("bpf: Introduce bpf_func_info"). If CONFIG_BPF is defined and CONFIG_BPF_SYSCALL is not defined, the following error will appear: kernel/bpf/core.c:414: undefined reference to `btf_type_by_id' kernel/bpf/core.c:415: undefined reference to `btf_name_by_offset' When CONFIG_BPF_SYSCALL is not defined, let us define stub inline functions for btf_type_by_id() and btf_name_by_offset() in include/linux/btf.h. This way, the compilation failure can be avoided. Fixes: `838e96904f` ("bpf: Introduce bpf_func_info") Reported-by: kbuild test robot <lkp@intel.com> Cc: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-20 15:21:45 -08:00
Davide Caratti	f2cbd48528	net/sched: act_police: fix race condition on state variables after 'police' configuration parameters were converted to use RCU instead of spinlock, the state variables used to compute the traffic rate (namely 'tcfp_toks', 'tcfp_ptoks' and 'tcfp_t_c') are erroneously read/updated in the traffic path without any protection. Use a dedicated spinlock to avoid race conditions on these variables, and ensure proper cache-line alignment. In this way, 'police' is still faster than what we observed when 'tcf_lock' was used in the traffic path _ i.e. reverting commit `2d550dbad8` ("net/sched: act_police: don't use spinlock in the data path"). Moreover, we preserve the throughput improvement that was obtained after 'police' started using per-cpu counters, when 'avrate' is used instead of 'rate'. Changes since v1 (thanks to Eric Dumazet): - call ktime_get_ns() before acquiring the lock in the traffic path - use a dedicated spinlock instead of tcf_lock - improve cache-line usage Fixes: `2d550dbad8` ("net/sched: act_police: don't use spinlock in the data path") Reported-and-suggested-by: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com>	2018-11-20 14:59:58 -08:00
Linus Torvalds	c8ce94b8fe	A few MIPS fixes for 4.20: - Re-enable the Cavium Octeon USB driver in its defconfig after it was accidentally removed back in 4.14. - Have early memblock allocations be performed bottom-up to more closely match the behaviour we used to have with bootmem, which seems a safer choice since we've seen fallout from the change made in the 4.20 merge window. - Simplify max_low_pfn calculation in the NUMA code for the Loongson3 & SGI IP27 platforms to both clean up the code & ensure max_low_pfn has been set appropriately before it is used. -----BEGIN PGP SIGNATURE----- iIsEABYIADMWIQRgLjeFAZEXQzy86/s+p5+stXUA3QUCW/R1TRUccGF1bC5idXJ0 b25AbWlwcy5jb20ACgkQPqefrLV1AN19+gEAyjWhck3E/fJ38CEat3h8xg2zikjL maRJMMbD0S055eIA/jWhyjpTEseNTLKycpRWAF+3LU0YU2llb/Ui0IJBCP4O =r2Sa -----END PGP SIGNATURE----- Merge tag 'mips_fixes_4.20_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux Pull MIPS fixes from Paul Burton: "A few MIPS fixes for 4.20: - Re-enable the Cavium Octeon USB driver in its defconfig after it was accidentally removed back in 4.14. - Have early memblock allocations be performed bottom-up to more closely match the behaviour we used to have with bootmem, which seems a safer choice since we've seen fallout from the change made in the 4.20 merge window. - Simplify max_low_pfn calculation in the NUMA code for the Loongson3 and SGI IP27 platforms to both clean up the code & ensure max_low_pfn has been set appropriately before it is used" * tag 'mips_fixes_4.20_3' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux: MIPS: Loongson3,SGI-IP27: Simplify max_low_pfn calculation MIPS: Let early memblock_alloc*() allocate memories bottom-up MIPS: OCTEON: cavium_octeon_defconfig: re-enable OCTEON USB driver	2018-11-20 14:31:00 -08:00
Heiner Kallweit	b1d9823301	MAINTAINERS: add myself as co-maintainer for r8169 Meanwhile I know the driver quite well and I refactored bigger parts of it. As a result people contact me already with r8169 questions. Therefore I'd volunteer to become co-maintainer of the driver also officially. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-11-20 14:22:45 -08:00
Kenneth Feng	a5d0f45659	drm/amdgpu: Enable HDP memory light sleep Due to the register name and setting change of HDP memory light sleep on Vega20, change accordingly in the driver. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-20 14:40:15 -05:00
Bruce Allan	f25dad19ba	ice: Fix possible NULL pointer de-reference A recent update to smatch is causing it to report the error "we previously assumed 'm_entry->vsi_list_info' could be null". Fix that. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-20 11:39:04 -08:00
Anirudh Venkataramanan	d337f2afb7	ice: Use Tx\|Rx in comments In code comments, use Tx\|Rx instead of tx\|rx Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-20 11:39:04 -08:00
Anirudh Venkataramanan	df17b7e02f	ice: Cosmetic formatting changes 1. Fix several cases of double spacing 2. Fix typos 3. Capitalize abbreviations Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-20 11:39:04 -08:00
Bruce Allan	2c5492de87	ice: Cleanup short function signatures Function signatures that do not exceed 80-characters should be on a single line. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-20 11:39:04 -08:00
Bruce Allan	bc0c6fab8a	ice: Cleanup ice_tx_timeout() Clean up number of formatting issues and a comment that could use clarification. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-20 11:39:04 -08:00
Dave Ertman	e0c9fd9b77	ice: Fix return value from NAPI poll ice_napi_poll is hard-coded to return zero when it's done. It should instead return the work done (if any work was done). The only time it should return zero is if an interrupt or poll is handled and no work is performed. So change the return value to be the minimum of work done or budget-1. Signed-off-by: Dave Ertman <david.m.ertman@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-20 11:39:04 -08:00
Bruce Allan	55aa141ed9	ice: Constify global structures that can/should be Indicate these structs should not be modified and take advantage of some compiler optimizations by making these structs const. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-20 11:39:04 -08:00
Yashaswini Raghuram Prathivadi Bhayankaram	6a7e699369	ice: Do not set LAN_EN for MAC-VLAN filters In the action fields for a MAC-VLAN filter, do not set the LAN_EN flag if the MAC in the MAC-VLAN is unicast MAC. The unicast packets that match should not be forwarded to the wire. Signed-off-by: Yashaswini Raghuram Prathivadi Bhayankaram <yashaswini.raghuram.prathivadi.bhayankaram@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-20 11:39:04 -08:00
Jaroslaw Ilgiewicz	5fb597d7c8	ice: Pass the return value of ice_init_def_sw_recp() Added check of return value for ice_init_def_sw_recp(). Now we know if memory was correctly allocated. Signed-off-by: Jaroslaw Ilgiewicz <jaroslaw.ilgiewicz@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-20 11:39:03 -08:00
Bruce Allan	7afdbc903a	ice: Cleanup duplicate control queue code 1. Assigning the register offset and mask values contains duplicate code that can easily be replaced with a macro. 2. Separate functions for freeing send queue and receive queue rings are not needed; replace with a single function that uses a pointer to the struct ice_ctl_q_ring structure as a parameter instead of a pointer to the struct ice_ctl_q_info structure. 3. Initializing register settings for both send queue and receive queue contains duplicate code that can easily be replaced with a helper function. 4. Separate functions for freeing send queue and receive queue buffers are not needed; duplicate code can easily be replaced with a macro. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-20 11:39:03 -08:00
Akeem G Abodunrin	d38b08834f	ice: Do autoneg based on VSI state If VSI state is up, we should do autoneg with link up, otherwise with link down. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>	2018-11-20 11:39:03 -08:00
Alexei Starovoitov	740baecd81	Merge branch 'btf-func-info' Martin KaFai Lau says: ==================== The BTF support was added to kernel by Commit `69b693f0ae` ("bpf: btf: Introduce BPF Type Format (BTF)"), which introduced .BTF section into ELF file and is primarily used for map pretty print. pahole is used to convert dwarf to BTF for ELF files. This patch added func info support to the kernel so we can get better ksym's for bpf function calls. Basically, function call types are passed to kernel and the kernel extracts function names from these types in order to construct ksym for these functions. The llvm patch at https://reviews.llvm.org/D53736 will generate .BTF section and one more section .BTF.ext. The .BTF.ext section encodes function type information. The following is a sample output for selftests test_btf with file test_btf_haskv.o for translated insns and jited insns respectively. $ bpftool prog dump xlated id 1 int _dummy_tracepoint(struct dummy_tracepoint_args * arg): 0: (85) call pc+2#bpf_prog_2dcecc18072623fc_test_long_fname_1 1: (b7) r0 = 0 2: (95) exit int test_long_fname_1(struct dummy_tracepoint_args * arg): 3: (85) call pc+1#bpf_prog_89d64e4abf0f0126_test_long_fname_2 4: (95) exit int test_long_fname_2(struct dummy_tracepoint_args * arg): 5: (b7) r2 = 0 6: (63) (u32 )(r10 -4) = r2 7: (79) r1 = (u64 )(r1 +8) ... 22: (07) r1 += 1 23: (63) (u32 )(r0 +4) = r1 24: (95) exit $ bpftool prog dump jited id 1 int _dummy_tracepoint(struct dummy_tracepoint_args * arg): bpf_prog_b07ccb89267cf242__dummy_tracepoint: 0: push %rbp 1: mov %rsp,%rbp ...... 3c: add $0x28,%rbp 40: leaveq 41: retq int test_long_fname_1(struct dummy_tracepoint_args * arg): bpf_prog_2dcecc18072623fc_test_long_fname_1: 0: push %rbp 1: mov %rsp,%rbp ...... 3a: add $0x28,%rbp 3e: leaveq 3f: retq int test_long_fname_2(struct dummy_tracepoint_args * arg): bpf_prog_89d64e4abf0f0126_test_long_fname_2: 0: push %rbp 1: mov %rsp,%rbp ...... 80: add $0x28,%rbp 84: leaveq 85: retq Changelogs: v4 -> v5: . Add back BTF_KIND_FUNC_PROTO as v1 did. The difference is BTF_KIND_FUNC_PROTO cannot have t->name_off now. All param metadata is defined in BTF_KIND_FUNC_PROTO. BTF_KIND_FUNC must have t->name_off != 0 and t->type refers to a BTF_KIND_FUNC_PROTO. The above is the conclusion after the discussion between Edward Cree, Alexei, Daniel, Yonghong and Martin. v3 -> v4: . Remove BTF_KIND_FUNC_PROTO. BTF_KIND_FUNC is used for both function pointer and subprogram. The name_off field is used to distinguish both. . The record size is added to the func_info subsection in .BTF.ext to enable future extension. . The bpf_prog_info interface change to make it similar bpf_prog_load. . Related kernel and libbpf changes to accommodate the new .BTF.ext and kernel interface changes. v2 -> v3: . Removed kernel btf extern functions btf_type_id_func() and btf_get_name_by_id(). Instead, exposing existing functions btf_type_by_id() and btf_name_by_offset(). . Added comments about ELF section .BTF.ext layout. . Better codes in btftool as suggested by Edward Cree. v1 -> v2: . Added missing sign-off. . Limited the func_name/struct_member_name length for validity test. . Removed/changed several verifier messages. . Modified several commit messages to remove line_off reference. ==================== Acked-by: Edward Cree <ecree@solarflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-20 10:54:40 -08:00
Yonghong Song	254471e57a	tools/bpf: bpftool: add support for func types This patch added support to print function signature if btf func_info is available. Note that ksym now uses function name instead of prog_name as prog_name has a limit of 16 bytes including ending '\0'. The following is a sample output for selftests test_btf with file test_btf_haskv.o for translated insns and jited insns respectively. $ bpftool prog dump xlated id 1 int _dummy_tracepoint(struct dummy_tracepoint_args * arg): 0: (85) call pc+2#bpf_prog_2dcecc18072623fc_test_long_fname_1 1: (b7) r0 = 0 2: (95) exit int test_long_fname_1(struct dummy_tracepoint_args * arg): 3: (85) call pc+1#bpf_prog_89d64e4abf0f0126_test_long_fname_2 4: (95) exit int test_long_fname_2(struct dummy_tracepoint_args * arg): 5: (b7) r2 = 0 6: (63) (u32 )(r10 -4) = r2 7: (79) r1 = (u64 )(r1 +8) ... 22: (07) r1 += 1 23: (63) (u32 )(r0 +4) = r1 24: (95) exit $ bpftool prog dump jited id 1 int _dummy_tracepoint(struct dummy_tracepoint_args * arg): bpf_prog_b07ccb89267cf242__dummy_tracepoint: 0: push %rbp 1: mov %rsp,%rbp ...... 3c: add $0x28,%rbp 40: leaveq 41: retq int test_long_fname_1(struct dummy_tracepoint_args * arg): bpf_prog_2dcecc18072623fc_test_long_fname_1: 0: push %rbp 1: mov %rsp,%rbp ...... 3a: add $0x28,%rbp 3e: leaveq 3f: retq int test_long_fname_2(struct dummy_tracepoint_args * arg): bpf_prog_89d64e4abf0f0126_test_long_fname_2: 0: push %rbp 1: mov %rsp,%rbp ...... 80: add $0x28,%rbp 84: leaveq 85: retq Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-20 10:54:39 -08:00
Yonghong Song	999d82cbc0	tools/bpf: enhance test_btf file testing to test func info Change the bpf programs test_btf_haskv.c and test_btf_nokv.c to have two sections, and enhance test_btf.c test_file feature to test btf func_info returned by the kernel. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-20 10:54:39 -08:00
Yonghong Song	d7f5b5e051	tools/bpf: refactor to implement btf_get_from_id() in lib/bpf The function get_btf() is implemented in tools/bpf/bpftool/map.c to get a btf structure given a map_info. This patch refactored this function to be function btf_get_from_id() in tools/lib/bpf so that it can be used later. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-20 10:54:39 -08:00
Yonghong Song	9ce6ae22c8	tools/bpf: do not use pahole if clang/llvm can generate BTF sections Add additional checks in tools/testing/selftests/bpf and samples/bpf such that if clang/llvm compiler can generate BTF sections, do not use pahole. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-20 10:54:39 -08:00
Yonghong Song	2993e0515b	tools/bpf: add support to read .BTF.ext sections The .BTF section is already available to encode types. These types can be used for map pretty print. The whole .BTF will be passed to the kernel as well for which kernel can verify and return to the user space for pretty print etc. The llvm patch at https://reviews.llvm.org/D53736 will generate .BTF section and one more section .BTF.ext. The .BTF.ext section encodes function type information and line information. Note that this patch set only supports function type info. The functionality is implemented in libbpf. The .BTF section can be directly loaded into the kernel, and the .BTF.ext section cannot. The loader may need to do some relocation and merging, similar to merging multiple code sections, before loading into the kernel. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-20 10:54:39 -08:00
Yonghong Song	4798c4ba3b	tools/bpf: extends test_btf to test load/retrieve func_type info A two function bpf program is loaded with btf and func_info. After successful prog load, the bpf_get_info syscall is called to retrieve prog info to ensure the types returned from the kernel matches the types passed to the kernel from the user space. Several negative tests are also added to test loading/retrieving of func_type info. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-20 10:54:39 -08:00
Yonghong Song	7e0d0fb552	tools/bpf: add new fields for program load in lib/bpf The new fields are added for program load in lib/bpf so application uses api bpf_load_program_xattr() is able to load program with btf and func_info data. This functionality will be used in next patch by bpf selftest test_btf. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Martin KaFai Lau <kafai@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2018-11-20 10:54:39 -08:00
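
The ice NAPI fix listed above (e0c9fd9b77) says the poll routine should return the minimum of the work done and budget-1 rather than a hard-coded zero. A minimal sketch of that rule follows; `poll_return_value` is a hypothetical helper written for illustration, not code taken from the ice driver.

```c
/*
 * Hypothetical helper (not from the ice driver): the value a poll
 * routine would report when it is done polling, following the rule
 * in the commit message: min(work_done, budget - 1).
 */
static int poll_return_value(int work_done, int budget)
{
	int limit = budget - 1;

	/* Zero is returned only when no work was performed. */
	return work_done < limit ? work_done : limit;
}
```

With a budget of 64, an idle poll reports 0, a poll that processed 10 packets reports 10, and a poll that consumed the whole budget but is still exiting reports 63, so the returned value stays below the budget.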


798167 Commits