Since we now have several options, in the help we print the list
of all supported options and a brief description of them.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some tests can fail with transports that have a slightly
different behavior, so let's add the possibility to specify
which tests to skip.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Before check if a send returns -EPIPE, we need to make sure the
connection is closed.
To do that, we use epoll API to wait EPOLLRDHUP or EPOLLHUP events
on the socket.
Reported-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The vsock_test.c program runs a test suite of AF_VSOCK test cases.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Test cases will want to transfer data. This patch adds utility
functions to do this.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
See code comment for details.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Many test cases will need to connect to the server or accept incoming
connections. This patch extracts these operations into utility
functions that can be reused.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move useful functions into a separate file in preparation for more
vsock test programs.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The vsock_diag_test program directly included ../../../include/uapi/
headers from the source tree. Tests are supposed to use the
usr/include/linux/ headers that have been prepared with make
headers_install instead.
Suggested-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove special taglen define KSZ8795_INGRESS_TAG_LEN
and use generic KSZ_INGRESS_TAG_LEN instead.
Signed-off-by: Michael Grzeschik <m.grzeschik@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Wiedmann says:
====================
s390/qeth: fixes 2019-12-18
please apply the following patch series to your net tree.
This brings two fixes for initialization / teardown issues, and one
ENOTSUPP cleanup.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
ENOTSUPP is not uapi, use EOPNOTSUPP instead.
Fixes: d66cb37e96 ("qeth: Add new priority queueing options")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When managing the promiscuous mode during an RX modeset, qeth caches the
current HW state to avoid repeated programming of the same state on each
modeset.
But while tearing down a device, we forget to clear the cached state. So
when the device is later set online again, the initial RX modeset
doesn't program the promiscuous mode since we believe it is already
enabled.
Fix this by clearing the cached state in the tear-down path.
Note that for the SBP variant of promiscuous mode, this accidentally
works right now because we unconditionally restore the SBP role while
re-initializing.
Fixes: 4a71df5004 ("qeth: new qeth device driver")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Along with z/VM NICs, there's additional device types that only support
a specific transport mode (eg. external-bridged IQD).
Identify the corresponding error code, and raise a fitting error message
so that the user knows to adjust their device configuration.
On top of that also fix the subsequent error path, so that the rejected
cmd doesn't need to wait for a timeout but gets cancelled straight away.
Fixes: 4a71df5004 ("qeth: new qeth device driver")
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Oleksij Rempel says:
====================
add dsa switch support for ar9331
changes v6:
- remove ag71xx changes from this patch set. It needs more work.
- ar9331: fix register definition and add ASCII art switch documentation.
changes v6:
- rebase against net-next
changes v5:
- remote support for port5. The effort of using this port is
questionable. Currently, it is better to not use it at all, then
adding buggy support.
- remove port enable call back. There is nothing what we actually need
to enable.
- rebase it against v5.5-rc1
changes v4:
- ag71xx: ag71xx_mac_validate fix always false comparison (&& -> ||)
- tag_ar9331: use skb_pull_rcsum() instead of skb_pull().
- tag_ar9331: drop skb_set_mac_header()
changes v3:
- ag71xx: ag71xx_mac_config: ignore MLO_AN_INBAND mode. It is not
supported by HW and SW.
- ag71xx: ag71xx_mac_validate: return all supported bits on
PHY_INTERFACE_MODE_NA
changes v2:
- move Atheros AR9331 TAG format to separate patch
- use netdev_warn_once in the tag driver to reduce potential message spam
- typo fixes
- reorder tag driver alphabetically
- configure switch to maximal frame size
- use mdiobus_read/write
- fail if mdio sub node is not found
- add comment for post reset state
- remove deprecated comment about device id
- remove phy-handle option for node with fixed-link
- ag71xx: set 1G support only for GMII mode
This patch series provides dsa switch support for Atheros ar9331 WiSoC.
As side effect ag71xx needed to be ported to phylink to make the switch
driver (as well phylink based) work properly.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Provide basic support for Atheros AR9331 built-in switch. So far it
works as port multiplexer without any hardware offloading support.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for tag format used in Atheros AR9331 built-in switch.
Reviewed-by: Vivien Didelot <vivien.didelot@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add switch node supported by dsa ar9331 driver.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Atheros AR9331 has built-in 5 port switch. The switch can be configured
to use all 5 or 4 ports. One of built-in PHYs can be used by first built-in
ethernet controller or to be used directly by the switch over second ethernet
controller.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
When sample xdp_redirect_cpu was converted to use libbpf, the
tracepoints used by this sample were not getting attached automatically
like with bpf_load.c. The BPF-maps was still getting loaded, thus
nobody notice that the tracepoints were not updating these maps.
This fix doesn't use the new skeleton code, as this bug was introduced
in v5.1 and stable might want to backport this. E.g. Red Hat QA uses
this sample as part of their testing.
Fixes: bbaf6029c4 ("samples/bpf: Convert XDP samples to libbpf usage")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/157685877642.26195.2798780195186786841.stgit@firesoul
Jay Jayatheerthan says:
====================
This series of patches enhances xdpsock application with command line
parameters to set transmit packet size and fill pattern among other options.
The application has also been enhanced to use Linux Ethernet/IP/UDP header
structs and calculate IP and UDP checksums.
I have measured the performance of the xdpsock application before and after
this patch set and have not been able to detect any difference.
Packet Size:
------------
There is a new option '-s' or '--tx-pkt-size' to specify the transmit packet
size. It ranges from 47 to 4096 bytes. Default packet size is 64 bytes
which is same as before.
Fill Pattern:
-------------
The transmit UDP payload fill pattern is specified using '-P' or
'--tx-pkt-pattern'option. It is an unsigned 32 bit field and defaulted
to 0x12345678.
Packet Count:
-------------
The number of packets to send is specified using '-C' or '--tx-pkt-count'
option. If it is not specified, the application sends packets forever.
Batch Size:
-----------
The batch size for transmit, receive and l2fwd features of the application is
specified using '-b' or '--batch-size' options. Default value when this option
is not provided is 64 (same as before).
Duration:
---------
The application supports '-d' or '--duration' option to specify number of
seconds to run. This is used in tx, rx and l2fwd features. If this option is
not provided, the application runs for ever.
====================
Tested-by: Björn Töpel <bjorn.topel@intel.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
The UDP payload fill pattern can be specified using '-P' or '--tx-pkt-pattern'
option. It is an unsigned 32 bit field and defaulted to 0x12345678.
The IP and UDP checksum is calculated by the code as per the content of
the packet before transmission.
Signed-off-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191220085530.4980-7-jay.jayatheerthan@intel.com
New option '-s' or '--tx-pkt-size' has been added to specify the transmit
packet size. The packet size ranges from 47 to 4096 bytes. When this
option is not provided, it defaults to 64 byte packet.
The code uses struct ethhdr, struct iphdr and struct udphdr to form the
transmit packet. The MAC address, IP address and UDP ports are set to default
values.
The code calculates IP and UDP checksums before sending the packet.
Checksum calculation code in Linux kernel is used for this purpose.
The Ethernet FCS is not filled by the code.
Signed-off-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191220085530.4980-6-jay.jayatheerthan@intel.com
Use '-C' or '--tx-pkt-count' to specify number of packets to send.
If it is not specified, the application sends packets forever. If packet
count is not a multiple of batch size, last batch sent is less than the
batch size.
Signed-off-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191220085530.4980-5-jay.jayatheerthan@intel.com
New option to specify batch size for tx, rx and l2fwd has been added. This
allows fine tuning to maximize performance. It is specified using '-b' or
'--batch_size' options. When not specified default is 64.
Signed-off-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191220085530.4980-4-jay.jayatheerthan@intel.com
Add code to do cleanup for signals and application completion in a unified
fashion. The signal handler sets benckmark_done flag terminating the
threads. The cleanup is called before returning from main() function.
Signed-off-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191220085530.4980-3-jay.jayatheerthan@intel.com
The application now supports '-d' or '--duration' option to specify number of
seconds to run. This is used in tx, rx and l2fwd features. If this option is
not provided, the application runs forever.
Signed-off-by: Jay Jayatheerthan <jay.jayatheerthan@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191220085530.4980-2-jay.jayatheerthan@intel.com
It's follow-up for discussion [1]
CHECK and CHECK_FAIL macros in test_progs.h can affect errno in some
circumstances, e.g. if some code accidentally closes stdout. It makes
checking errno in patterns like this unreliable:
if (CHECK(!bpf_prog_attach_xattr(...), "tag", "msg"))
goto err;
CHECK_FAIL(errno != ENOENT);
, since by CHECK_FAIL time errno could be affected not only by
bpf_prog_attach_xattr but by CHECK as well.
Fix it by saving and restoring errno in the macros. There is no "Fixes"
tag since no problems were discovered yet and it's rather precaution.
test_progs was run with this change and no difference was identified.
[1] https://lore.kernel.org/bpf/20191219210907.GD16266@rdna-mbp.dhcp.thefacebook.com/
Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/20191220000511.1684853-1-rdna@fb.com
Magnus Karlsson says:
====================
This patch set cleans up the ring access functions of AF_XDP in hope
that it will now be easier to understand and maintain. I used to get a
headache every time I looked at this code in order to really understand it,
but now I do think it is a lot less painful.
The code has been simplified a lot and as a bonus we get better
performance in nearly all cases. On my new 2.1 GHz Cascade Lake
machine with a standard default config plus AF_XDP support and
CONFIG_PREEMPT on I get the following results in percent performance
increases with this patch set compared to without it:
Zero-copy (-N):
rxdrop txpush l2fwd
1 core: -2% 0% 3%
2 cores: 4% 0% 3%
Zero-copy with poll() (-N -p):
rxdrop txpush l2fwd
1 core: 3% 0% 1%
2 cores: 21% 0% 9%
Skb mode (-S):
Shows a 0% to 5% performance improvement over the same benchmarks as
above.
Here 1 core means that we are running the driver processing and the
application on the same core, while 2 cores means that they execute on
separate cores. The applications are from the xdpsock sample app.
On my older 2.0 Ghz Broadwell machine that I used for the v1, I get
the following results:
Zero-copy (-N):
rxdrop txpush l2fwd
1 core: 4% 5% 4%
2 cores: 1% 0% 2%
Zero-copy with poll() (-N -p):
rxdrop txpush l2fwd
1 core: 1% 3% 3%
2 cores: 22% 0% 5%
Skb mode (-S):
Shows a 0% to 1% performance improvement over the same benchmarks as
above.
When a results says 21 or 22% better, as in the case of poll mode with
2 cores and rxdrop, my first reaction is that it must be a
bug. Everything else shows between 0% and 5% performance
improvement. What is giving rise to 22%? A quick bisect indicates that
it is patches 2, 3, 4, 5, and 6 that are giving rise to most of this
improvement. So not one patch in particular, but something around 4%
improvement from each one of them. Note that exactly this benchmark
has previously had an extraordinary slow down compared to when running
without poll syscalls. For all the other poll tests above, the
slowdown has always been around 4% for using poll syscalls. But with
the bad performing test in question, it was above 25%. Interestingly,
after this clean up, the slow down is 4%, just like all the other poll
tests. Please take an extra peek at this so I have not messed up
something.
The 0% for several txpush results are due to the test bottlenecking on
a non-CPU HW resource. If I eliminated that bottleneck on my system, I
would expect to see an increase there too.
Changes v1 -> v2:
* Corrected textual errors in the commit logs (Sergei and Martin)
* Fixed the functions that detect empty and full rings so that they
now operate on the global ring state (Maxim)
This patch has been applied against commit a352a82496 ("Merge branch 'libbpf-extern-followups'")
Structure of the patch set:
Patch 1: Eliminate the lazy update threshold used when preallocating
entries in the completion ring
Patch 2: Simplify the detection of empty and full rings
Patch 3: Consolidate the two local producer pointers into one
Patch 4: Standardize the naming of the producer ring access functions
Patch 5: Eliminate the Rx batch size used for the fill ring
Patch 6: Simplify the functions xskq_nb_avail and xskq_nb_free
Patch 7: Simplify and standardize the naming of the consumer ring
access functions
Patch 8: Change the names of the validation functions to improve
readability and also the return value of these functions
Patch 9: Change the name of xsk_umem_discard_addr() to
xsk_umem_release_addr() to better reflect the new
names. Requires a name change in the drivers that support AF_XDP
zero-copy.
Patch 10: Remove unnecessary READ_ONCE of data in the ring
Patch 11: Add overall function naming comment and reorder the functions
for easier reference
Patch 12: Use the struct_size helper function when allocating rings
====================
Reviewed-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Björn Töpel <bjorn.topel@intel.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Add comments on how the ring access functions are named and how they
are supposed to be used for producers and consumers. The functions are
also reordered so that the consumer functions are in the beginning and
the producer functions in the end, for easier reference. Put this in a
separate patch as the diff might look a little odd, but no
functionality has changed in this patch.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1576759171-28550-12-git-send-email-magnus.karlsson@intel.com
There are two unnecessary READ_ONCE of descriptor data. These are not
needed since the data is written by the producer before it signals
that the data is available by incrementing the producer pointer. As the
access to this producer pointer is serialized and the consumer always
reads the descriptor after it has read and synchronized with the
producer counter, the write of the descriptor will have fully
completed and it does not matter if the consumer has any read tearing.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1576759171-28550-11-git-send-email-magnus.karlsson@intel.com
Change the name of xsk_umem_discard_addr to xsk_umem_release_addr to
better reflect the new naming of the AF_XDP queue manipulation
functions. As this functions is used by drivers implementing support
for AF_XDP zero-copy, it requires a name change to these drivers. The
function xsk_umem_release_addr_rq has also changed name in the same
fashion.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1576759171-28550-10-git-send-email-magnus.karlsson@intel.com
Change the names of the validation functions to better reflect what
they are doing. The uppermost ones are reading entries from the rings
and only the bottom ones validate entries. So xskq_cons_read_ is a
better prefix name.
Also change the xskq_cons_read_ functions to return a bool
as the the descriptor or address is already returned by reference
in the parameters. Everyone is using the return value as a bool
anyway.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1576759171-28550-9-git-send-email-magnus.karlsson@intel.com
Simplify and refactor consumer ring functions. The consumer first
"peeks" to find descriptors or addresses that are available to
read from the ring, then reads them and finally "releases" these
descriptors once it is done. The two local variables cons_tail
and cons_head are turned into one single variable called
cached_cons. cached_tail referred to the cached value of the
global consumer pointer and will be stored in cached_cons. For
cached_head, we just use cached_prod instead as it was not used
for a consumer queue before. It also better reflects what it
really is now: a cached copy of the producer pointer.
The names of the functions are also renamed in the same manner as
the producer functions. The new functions are called xskq_cons_
followed by what it does.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1576759171-28550-8-git-send-email-magnus.karlsson@intel.com
At this point, there are no users of the functions xskq_nb_avail and
xskq_nb_free that take any other number of entries argument than 1, so
let us get rid of the second argument that takes the number of
entries.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1576759171-28550-7-git-send-email-magnus.karlsson@intel.com
In the xsk consumer ring code there is a variable called RX_BATCH_SIZE
that dictates the minimum number of entries that we try to grab from
the fill and Tx rings. In fact, the code always try to grab the
maximum amount of entries from these rings. The only thing this
variable does is to throw an error if there is less than 16 (as it is
defined) entries on the ring. There is no reason to do this and it
will just lead to weird behavior from user space's point of view. So
eliminate this variable.
With this change, we will be able to simplify the xskq_nb_free and
xskq_nb_avail code in the next commit.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1576759171-28550-6-git-send-email-magnus.karlsson@intel.com
Adopt the naming of the producer ring access functions to have a
similar naming convention as the functions in libbpf, but adapted to
the kernel. You first reserve a number of entries that you later
submit to the global state of the ring. This is much clearer, IMO,
than the one that was in the kernel part. Once renamed, we also
discover that two functions are actually the same, so remove one of
them. Some of the primitive ring submission operations are also the
same so break these out into __xskq_prod_submit that the upper level
ring access functions can use.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1576759171-28550-5-git-send-email-magnus.karlsson@intel.com
Currently, the xsk ring code has two cached producer pointers:
prod_head and prod_tail. This patch consolidates these two into a
single one called cached_prod to make the code simpler and easier to
maintain. This will be in line with the user space part of the the
code found in libbpf, that only uses a single cached pointer.
The Rx path only uses the two top level functions
xskq_produce_batch_desc and xskq_produce_flush_desc and they both use
prod_head and never prod_tail. So just move them over to
cached_prod.
The Tx XDP_DRV path uses xskq_produce_addr_lazy and
xskq_produce_flush_addr_n and unnecessarily operates on both prod_tail
and prod_head, so move them over to just use cached_prod by skipping
the intermediate step of updating prod_tail.
The Tx path in XDP_SKB mode uses xskq_reserve_addr and
xskq_produce_addr. They currently use both cached pointers, but we can
operate on the global producer pointer in xskq_produce_addr since it
has to be updated anyway, thus eliminating the use of both cached
pointers. We can also remove the xskq_nb_free in xskq_produce_addr
since it is already called in xskq_reserve_addr. No need to do it
twice.
When there is only one cached producer pointer, we can also simplify
xskq_nb_free by removing one argument.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1576759171-28550-4-git-send-email-magnus.karlsson@intel.com
In order to set the correct return flags for poll, the xsk code has to
check if the Rx queue is empty and if the Tx queue is full. This code
was unnecessarily large and complex as it used the functions that are
used to update the local state from the global state (xskq_nb_free and
xskq_nb_avail). Since we are not doing this nor updating any data
dependent on this state, we can simplify the functions. Another
benefit from this is that we can also simplify the xskq_nb_free and
xskq_nb_avail functions in a later commit.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1576759171-28550-3-git-send-email-magnus.karlsson@intel.com
The lazy update threshold was introduced to keep the producer and
consumer some distance apart in the completion ring. This was
important in the beginning of the development of AF_XDP as the ring
format as that point in time was very sensitive to the producer and
consumer being on the same cache line. This is not the case
anymore as the current ring format does not degrade in any noticeable
way when this happens. Moreover, this threshold makes it impossible
to run the system with rings that have less than 128 entries.
So let us remove this threshold and just get one entry from the ring
as in all other functions. This will enable us to remove this function
in a later commit. Note that xskq_produce_addr_lazy followed by
xskq_produce_flush_addr_n are still not the same function as
xskq_produce_addr() as it operates on another cached pointer.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/1576759171-28550-2-git-send-email-magnus.karlsson@intel.com
Under heavy loads where the kyber I/O scheduler hits the token limits for
its scheduling domains, kyber can become stuck. When active requests
complete, kyber may not be woken up leaving the I/O requests in kyber
stuck.
This stuck state is due to a race condition with kyber and the sbitmap
functions it uses to run a callback when enough requests have completed.
The running of a sbt_wait callback can race with the attempt to insert the
sbt_wait. Since sbitmap_del_wait_queue removes the sbt_wait from the list
first then sets the sbq field to NULL, kyber can see the item as not on a
list but the call to sbitmap_add_wait_queue will see sbq as non-NULL. This
results in the sbt_wait being inserted onto the wait list but ws_active
doesn't get incremented. So the sbitmap queue does not know there is a
waiter on a wait list.
Since sbitmap doesn't think there is a waiter, kyber may never be
informed that there are domain tokens available and the I/O never advances.
With the sbt_wait on a wait list, kyber believes it has an active waiter
so cannot insert a new waiter when reaching the domain's full state.
This race can be fixed by only adding the sbt_wait to the queue if the
sbq field is NULL. If sbq is not NULL, there is already an action active
which will trigger the re-running of kyber. Let it run and add the
sbt_wait to the wait list if still needing to wait.
Reviewed-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Jeffery <djeffery@redhat.com>
Reported-by: John Pittman <jpittman@redhat.com>
Tested-by: John Pittman <jpittman@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
- Leftover put_cpu() in the perf/smmuv3 error path.
- Add Hisilicon TSV110 to spectre-v2 safe list
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE5RElWfyWxS+3PLO2a9axLQDIXvEFAl39DvwACgkQa9axLQDI
XvFgIxAAkmYKNeK6EpnxiMKQANeD77X0qPcJQWsiG5Qu7Xc5AybnVlZGUPA9yVxL
jZPyYtBULElZsQ6vbdmz+kXga4j51jVkvKt2ZTmnQvuEFDRY3Fby1lPWfrGGmXBX
YLU9OpLTs7DacfOLey9KuEoKYd2fqnetY+hDH07UhhY+VZFpH7fnFz03vxoX9mvs
dar4y+1fH5mbSfLAkprRR+LeNHUxYIkxAVZlSelHvcjr6epaniuv/uk4/bsgaoVS
mnZ7gLDoUwJ4btV1w/iY55RRUp5lRO6igBi49DFJyOmoSltfeFEA7KGJC4K7F+hp
qeO/9t3fCauCQCRCW6oSZmLbyiZyvLVBs0YcVrrsdfDjhwStygDhbJP/TIRZhcfC
5PAcOt/8nbU04j5+kiGTMu9Enx1htZ6wT8ycj2hQcEH4Xenixh8T4JDEIo1fNitJ
6GYhEBw8zukYouZO22ppcLALF1XKZ0MrDthhcLf7YSK58mtdXUxmont9mfA8v6Po
iVty3ZmMnZGrqk1DgFKC2pEtZRCNx/vSzTzRpQffDntE0vsq+L9pqgG2mapM230w
Wo4qZyKiRCfecD+PZeDvPaKzMAA4xMr32YFi3V/OK3OWoqkg/jRZaCi58xsDEzfG
QmZhNlrHPZe3ObPGaUB+zer1c+ZJ3KG2s4bNk78zizcZ8IHzgYY=
=+P0H
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:
- Leftover put_cpu() in the perf/smmuv3 error path.
- Add Hisilicon TSV110 to spectre-v2 safe list
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: cpu_errata: Add Hisilicon TSV110 to spectre-v2 safe list
perf/smmuv3: Remove the leftover put_cpu() in error path
exynos:
- component delete fix
i915:
- Fix to drop an unused and harmful display W/A
- Fix to define EHL power wells independent of ICL
- Fix for priority inversion on bonded requests
- Fix in mmio offset calculation of DSB instance
- Fix memory leak from get_task_pid when banning clients
- Fixes to avoid dereference of uninitialized ops in dma_fence tracing
and keep reference to execbuf object until submitted.
- vGPU state setting locking fix (Zhenyu)
- Fix vGPU display dmabuf as read-only (Zhenyu)
- Properly handle vGPU display dmabuf page pin when rendering (Tina)
- Fix one guest boot warning to handle guc reset state (Fred)
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJd/S2DAAoJEAx081l5xIa+lfAP+wdMacLoJtgBnsXR/P3ezVME
xjb+YWeXYOeJ/8PXKqtwZCsdmOOQM3CVp/rDKn0z/L2EKkY790aeIXuGd6PtKQKO
ZO9x6oeXQfV780HA5SZzUaoazyWtVVT72t0SOqr0NqRCqWJ/21ZGqX+v5axQsx1E
kkgwI+TgbYXGQ+wBNnm2P0WZQgE8j5k0YaZxmG4faButsYAuaJe3w6dqWFhbjnwz
9qHDGajU2y6d3JMAoCcza/vV4iOEkAHCkqr63y2/ryTpH/5QuH8n4ceSWgoCv2G4
G/kIvMa62Ob1iXf353fxtV64SR+z0NSEURMvjqo64Vr0lF+tklcPEDIaYcnp/D5N
4elyHHUddOIHOuwaczB930/aZJ90EHgESgiXQfqks1fuWldj2k3V/mI5PvCh2HtE
G2vPSjdh6jvbqeKmGDsWkoleekPlJgpOKFQXA2HyI2NK5/O476jI5meY9CCYtMf3
K05YHLF3ltzrwPwagUqz3wWRNWj5TqMlwk62V5GeOVtnagPkFjKq31ndI3TxJO1L
gSq1YoeD4TsWisdGxa1uQbSjJ8Z3c3OHTNaUnsD7gOOgxQXwPcdyvPoS3Q5GNW67
6rPhzZRKwVmaOfFQ8K3Ch5z67D0VfSS4t8V5P6owFOGm9oQEvkyeh+CMo80IL+7e
GoKvFw0ct72ejbms5ept
=RFC2
-----END PGP SIGNATURE-----
Merge tag 'drm-fixes-2019-12-21' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Dave Airlie:
"Probably the last one before Christmas, I'll see if there is much
demand over next few weeks for more fixes, I expect it'll be quiet
enough.
This has one exynos fix, and a bunch of i915 core and i915 GVT fixes.
Summary:
exynos:
- component delete fix
i915:
- Fix to drop an unused and harmful display W/A
- Fix to define EHL power wells independent of ICL
- Fix for priority inversion on bonded requests
- Fix in mmio offset calculation of DSB instance
- Fix memory leak from get_task_pid when banning clients
- Fixes to avoid dereference of uninitialized ops in dma_fence
tracing and keep reference to execbuf object until submitted.
- vGPU state setting locking fix (Zhenyu)
- Fix vGPU display dmabuf as read-only (Zhenyu)
- Properly handle vGPU display dmabuf page pin when rendering (Tina)
- Fix one guest boot warning to handle guc reset state (Fred)"
* tag 'drm-fixes-2019-12-21' of git://anongit.freedesktop.org/drm/drm:
drm/exynos: gsc: add missed component_del
drm/i915: Fix pid leak with banned clients
drm/i915/gem: Keep request alive while attaching fences
drm/i915: Fix WARN_ON condition for cursor plane ddb allocation
drm/i915/gvt: Fix guest boot warning
drm/i915/tgl: Drop Wa#1178
drm/i915/ehl: Define EHL powerwells independently of ICL
drm/i915: Set fence_work.ops before dma_fence_init
drm/i915: Copy across scheduler behaviour flags across submit fences
drm/i915/dsb: Fix in mmio offset calculation of DSB instance
drm/i915/gvt: Pin vgpu dma address before using
drm/i915/gvt: set guest display buffer as readonly
drm/i915/gvt: use vgpu lock for active state setting
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl39CowQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpufZD/4t5p6e5S1GO915Y35+q8ooOjd7Ci4a+QJh
mYV1KVlrvt+uPXSMHZoEj2JQhI3quEb2IHmUE5IydFBfIwJl2soD7mAsky2iQNaC
ULMQFCW33vVnfz7WyuHwkEHmdgEuKg8OeGWVSMEsjrFqygHYSWR94wmqJiYKpkgm
Klw4guyGzVfjHutxEDRM3QzHbHmy9xwSNDpJR9Vyr/s0GPOLCavpE71/1ztoc3mq
UbolvEirXwUgGNArC/YyHhJAMM+lWNYplWBdGM3YrKzmV2oqKQY9+148IOWeV3Yl
vmHLX0/s2WsbKnZPqE5DeuDc8X1fspJHcrQDn+BeM8w8TdGaSUHxgMIuyFulDzWr
+3cDjVGaKo3J41xnX7u7v3ph8qnDjMz6k6o6IaQtWz7MCwzJKpCEyF/dJcDIJfxU
7gaOnP5ltDf5wJWzfsOCYFA3/CTL2mshdHD4lg0siEp7CksfX6BFGoLNqdk5qhuv
0md3X0nMkTSsXd09tRjYBQUaKOJ9NnvD63bGIcSshUAMZ2JQUOcFdPNfkSwb3jq+
OMFnz/t6C8VOnyLwBYleYr8r5bum80lVzwvDa4LZNGivyeD/ne1HO+22WA5xDwod
8yNm/hBhy5FGtoucQU2Vo2P9SiAil586PLz9HxAtD9eUOBTLQxVOOrv9x8vVhAW3
Jln/sNGwYg==
=6pRI
-----END PGP SIGNATURE-----
Merge tag 'io_uring-5.5-20191220' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:
"Here's a set of fixes that should go into 5.5-rc3 for io_uring.
This is bigger than I'd like it to be, mainly because we're fixing the
case where an application reuses sqe data right after issue. This
really must work, or it's confusing. With 5.5 we're flagging us as
submit stable for the actual data, this must also be the case for
SQEs.
Honestly, I'd really like to add another series on top of this, since
it cleans it up considerable and prevents any SQE reuse by design. I
posted that here:
https://lore.kernel.org/io-uring/20191220174742.7449-1-axboe@kernel.dk/T/#u
and may still send it your way early next week once it's been looked
at and had some more soak time (does pass all regression tests). With
that series, we've unified the prep+issue handling, and only the prep
phase even has access to the SQE.
Anyway, outside of that, fixes in here for a few other issues that
have been hit in testing or production"
* tag 'io_uring-5.5-20191220' of git://git.kernel.dk/linux-block:
io_uring: io_wq_submit_work() should not touch req->rw
io_uring: don't wait when under-submitting
io_uring: warn about unhandled opcode
io_uring: read opcode and user_data from SQE exactly once
io_uring: make IORING_OP_TIMEOUT_REMOVE deferrable
io_uring: make IORING_OP_CANCEL_ASYNC deferrable
io_uring: make IORING_POLL_ADD and IORING_POLL_REMOVE deferrable
io_uring: make HARDLINK imply LINK
io_uring: any deferred command must have stable sqe data
io_uring: remove 'sqe' parameter to the OP helpers that take it
io_uring: fix pre-prepped issue with force_nonblock == true
io-wq: re-add io_wq_current_is_worker()
io_uring: fix sporadic -EFAULT from IORING_OP_RECVMSG
io_uring: fix stale comment and a few typos
- Fix to drop an unused and harmful display W/A
- Fix to define EHL power wells independent of ICL
- Fix for priority inversion on bonded requests
- Fix in mmio offset calculation of DSB instance
- Fix memory leak from get_task_pid when banning clients
- Fixes to avoid dereference of uninitialized ops in dma_fence tracing
and keep reference to execbuf object until submitted.
- Includes gvt-fixes-2019-12-18
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191219124635.GA16068@jlahtine-desk.ger.corp.intel.com
. Make sure to unregister a component for Exynos gscaler driver
when the driver is removed.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJd+r4lAAoJEFc4NIkMQxK4igAP/3ttbkx2mY+ymhFkDvdji6JA
2gZPHmgp8luuVA4aMTwCNOteKcxvi7cA5j1ogwDIFShnSJhhvN4eWZVQPmN2XT6G
iM7io6zQB55UE39Kce3P2fYNAGmV8qS5k211s/C1NcC0k2Q5uhqcA1/n1tEKz/ga
6WMEnGrSZkfz2JJAxs4lcE1pJ4MHTqvdI9wgNmmW5t7pf2Z/Vbyr5lQhsbZV1Oj4
VhCEtTB7sqCTnOMh5sWpapqLBanpHXtPi+sxUR9L+ujVQmRV3u41sdylz4ALMuGA
31SmM/VpmDO8/6w3X1bi6u58ba/0JNDy2/v3Bv2f85Fs2NXYDkdo9pmRrLst5vsq
SftIgkhE+o/K+iMiq8n6R1+oWsBi9Q0pOtkZYRHuIguGpr6N+jF1tHMWZ4h7Cauo
BRyVN60cw+4bEOLmVRV/MPXm/bsCgs2Wpajq4BMLapmTPSSRbR0/uXsNqLdBvJZ0
Qa9PJ33hsw6U+o6bpDe7BMv75AsCrJoKsTKvLo3Igtu/gMTjVYWuGXmbUJMaPzqC
5zs51Zh8+WAiBMGMlXRTe8/fE9nx1kQreZTocpF9BN5EjSJW8cl3SSfcj4cmCHPd
Z7e2NFfoiI05KTDTUno56nGEHM3dm0qHzFoWXdgd1v6Qv/8FDHGCDsIrXrzm7a1s
dBkPvC2NUZWIiX9nVZUf
=M6lz
-----END PGP SIGNATURE-----
Merge tag 'exynos-drm-fixes-for-v5.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-fixes
Just one bug fixup
. Make sure to unregister a component for Exynos gscaler driver
when the driver is removed.
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Inki Dae <inki.dae@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1576714013-3788-1-git-send-email-inki.dae@samsung.com