This patch should help to improve the performance of the mlx4 and mlx5 on a
number of architectures. For example, on x86 the dma_wmb/rmb equates out
to a barrer() call as the architecture is already strong ordered, and on
PowerPC the call works out to a lwsync which is significantly less expensive
than the sync call that was being used for wmb.
I placed the new barriers between any spots that seemed to be trying to
order memory/memory reads or writes, if there are any spots that involved
MMIO I left the existing wmb in place as the new barriers cannot order
transactions between coherent and non-coherent memories.
v2: Reduced the replacments to just the spots where I could clearly
identify the usage pattern.
Cc: Amir Vadai <amirv@mellanox.com>
Cc: Ido Shamay <idos@mellanox.com>
Cc: Eli Cohen <eli@mellanox.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update the Chelsio Ethernet drivers to use the dma_rmb/wmb calls instead of
the full barriers in order to improve performance.
Cc: Santosh Raspatur <santosh@chelsio.com>
Cc: Hariprasad S <hariprasad@chelsio.com>
Cc: Casey Leedom <leedom@chelsio.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In socfpga_dwmac_parse_data forward error code from devm_reset_control_get.
This gives the driver another chance to laod if altr,rst-mgr is loaded after
the network driver.
Signed-off-by: Phil Reid <preid@electromag.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Netlink attribute for the power is s8. But for the driver level
operations we are collection power level value into integer.
It has to be change to s8 from int.
Signed-off-by: Varka Bhadram <varkab@cdac.in>
Acked-by: Alexander Aring <alex.aring@gmail.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
The Intel vendors events indicating firmware loading result and the
bootup of the operational firmware are currently hardcoded byte
comparisons. So intead of doing that, provide proper data structures
and actually use them.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
When establishing a Bluetooth LE connection, read the remote used
features mask to determine which features are supported. This was
not really needed with Bluetooth 4.0, but since Bluetooth 4.1 and
also 4.2 have introduced new optional features, this becomes more
important.
This works the same as with BR/EDR where the connection enters the
BT_CONFIG stage and hci_connect_cfm call is delayed until the remote
features have been retrieved. Only after successfully receiving the
remote features, the connection enters the BT_CONNECTED state.
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
For kernel_sendmsg() that eliminates the need to play with setfs();
for kernel_recvmsg() it does *not* - a couple of callers are using
it with non-NULL ->msg_control, which would be treated as userland
address on recvmsg side of things.
In all cases we are really setting a kvec-backed iov_iter, though.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
trivial conflict in net/socket.c and non-trivial one in crypto -
that one had evaded aio_complete() removal.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The numPosted field in the ERX Doorbell register is 8-bits wide.
So the max buffers that we can post at a time is 255 and not 256
which we are doing currently.
Signed-off-by: Ajit Khaparde <ajit.khaparde@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
FastOpen requests are not like other regular request sockets.
They do not yet use rsk_timer : tcp_fastopen_queue_check()
simply manually removes one expired request from fastopenq->rskq_rst
list.
Therefore, tcp_check_req() must not call mod_timer_pending(),
otherwise we crash because rsk_timer was not initialized.
Fixes: fa76ce7328 ("inet: get rid of central tcp/dccp listener timer")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This is the NFC pull request for 4.1.
This is a shorter one than usual, as the Intel Field Peak NFC
driver could not make it in time.
We have:
- A new driver for NXP NCI based chipsets, like e.g. the NPC100 or
the PN7150. It currently only supports an i2c physical layer, but
could easily be extended to work on top of e.g. SPI.
This driver also includes support for user space triggered firmware
updates.
- A few minor st21nfc[ab] fixes, cleanups, and comments improvements.
- A pn533 error return fix.
- A few NFC related logs formatting cleanups.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJVJV81AAoJEIqAPN1PVmxKM3sP/R6fKXMxtVxXsiPzBMk+SpBI
onNXbCx27rp1lHTFznE16JAaaSfcQEwSQmYx7xa6KXqRYVlDfRfC+5R+rPGrCjfH
kV/eLBNnYpmPw0ViVT7dsWK0b4me0k5pr9ki9mze3YxvuMbA5vdv0tvuRFz5IRF/
hl+WI5pntGuRtnIyBKIauAMylylUYVvCBGhmHnveiX0Dp8noJLBSV0wvdzujm51S
+Uio/jHlUEV2lotrQBOfWNtEkwonXSwzZWSzimBCyEGLAwTx4lGXHmQftOtPd3zE
sOT7Gw77ZCsulHoHiJyC0KpDSDS0NYVrtTI5BiuGgAivGi0YZw2XD3CYRIdy0m2Z
lMoodYdiCgsxmrU6I6Rd/7DQBxD0Vhc+CGAyk41f7AwU4Fwq105kpSLupTtU+NzT
ZpdvSXeirU8sxt+3WDOgv8esyYGZVVD8GuBbofMZQZ0vLq+FcDpYAW4w3LKvvi6X
C+WN8f7SCI0kqpd4leyl6EG3SoQKFyPWobu0IlV520R9b76iBcyqooTIpvVa4Yk7
az6fKhi9gK/T6FHW78y6fnkczd47JKrC924m+g6P/hhTD5zQ956ferp0uFTjkPtF
8kNgRT7OIRi1JO783cnQx1uA61axC3GX1HFzsD9paVkTwRNtJ3eFsxtcYt7Nfv8D
WGNWRQp5LmBsD/SqFBfk
=MzOJ
-----END PGP SIGNATURE-----
Merge tag 'nfc-next-4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-next
Samuel Ortiz says:
====================
NFC: 4.1 pull request
This is the NFC pull request for 4.1.
This is a shorter one than usual, as the Intel Field Peak NFC
driver could not make it in time.
We have:
- A new driver for NXP NCI based chipsets, like e.g. the NPC100 or
the PN7150. It currently only supports an i2c physical layer, but
could easily be extended to work on top of e.g. SPI.
This driver also includes support for user space triggered firmware
updates.
- A few minor st21nfc[ab] fixes, cleanups, and comments improvements.
- A pn533 error return fix.
- A few NFC related logs formatting cleanups.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commits:
d92916f71a ("sfc: Own header for nic-specific sriov functions,")
25672dba95 ("sfc: Enable VF's via a write to the sysfs file
sriov_numvfs")
As they break the build with SRIOV disabled and there is no
easy way to fix it the way things are arranged.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
const __read_mostly is a senseless combination. If something
is already const it cannot be __read_mostly. Remove the bogus
__read_mostly in the fou driver.
This fixes section conflicts with LTO.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Alessio Igor Bogani <alessio.bogani@elettra.eu>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
More recent GCC warns about two kinds of switch statement uses:
1) Switching on an enumeration, but not having an explicit case
statement for all members of the enumeration. To show the
compiler this is intentional, we simply add a default case
with nothing more than a break statement.
2) Switching on a boolean value. I think this warning is dumb
but nevertheless you get it wholesale with -Wswitch.
This patch cures all such warnings in netfilter.
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Nicolas Dichtel says:
====================
selinux: add some missing nlmsg commands
It's not a critical issue, thus the patches are based on net-next.
Patches are splitted because the 'Fixes' tag is not the same for all
commands.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
These commands are missing.
Fixes: 28d8909bc7 ("[XFRM]: Export SAD info.")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This command is missing.
Fixes: ecfd6b1837 ("[XFRM]: Export SPD info")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This new command is missing.
Fixes: 880a6fab8f ("xfrm: configure policy hash table thresholds by netlink")
Reported-by: Christophe Gouault <christophe.gouault@6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This new command is missing.
Fixes: 9a9634545c ("netns: notify netns id events")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
These new commands are missing.
Fixes: 0c7aecd4bd ("netns: add rtnl cmd to add and get peer netns ids")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sowmini Varadhan says:
====================
RDS: RDS-core fixes
This patch-series updates the RDS core and rds-tcp modules with
some bug fixes that were originally authored by Andy Grover,
Zach Brown, and Chris Mason.
v2: Code review comment by Sergei Shtylov
V3: DaveM comments:
- dropped patches 3, 5 for "heuristic" changes in rds_send_xmit().
Investigation into the root-cause of these IB-triggered changes
produced the feedback: "I don't remember seeing "RDS: Stuck RM"
message in last 1-1.5 years and checking with other folks. It may very
well be some old workaround for stale connection for which long term
fix is already made and this part of code not exercised anymore."
Any such fixes, *if* they are needed, can/should be done in the
IB specific RDS transport modules.
- similarly dropped the LL_SEND_FULL patch (patch 6 in v2 set)
v4: Documentation/networking/rds.txt contains incorrect references
to "missing sysctl values for pf_rds and sol_rds in mainline".
The sysctl values were never needed in mainline, thus fix the
documentation.
v5: Clarify comment per http://www.spinics.net/lists/netdev/msg324220.html
v6: Re-added entire version history to cover letter.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If a determined set of concurrent senders keep the send queue full,
we can loop forever inside rds_send_xmit. This fix has two parts.
First we are dropping out of the while(1) loop after we've processed a
large batch of messages.
Second we add a generation number that gets bumped each time the
xmit bit lock is acquired. If someone else has jumped in and
made progress in the queue, we skip our goto restart.
Original patch by Chris Mason.
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Passive connections were added for the case where one loopback IB
connection between identical addresses needs another connection to store
the second QP. Unfortunately, they were also created in the case where
the addesses differ and we already have both QPs.
This lead to a message reordering bug.
- two different IB interfaces and addresses on a machine: A B
- traffic is sent from A to B
- connection from A-B is created, connect request sent
- listening accepts connect request, B-A is created
- traffic flows, next_rx is incremented
- unacked messages exist on the retrans list
- connection A-B is shut down, new connect request sent
- listen sees existing loopback B-A, creates new passive B-A
- retrans messages are sent and delivered because of 0 next_rx
The problem is that the second connection request saw the previously
existing parent connection. Instead of using it, and using the existing
next_rx_seq state for the traffic between those IPs, it mistakenly
thought that it had to create a passive connection.
We fix this by only using passive connections in the special case where
laddr and faddr match. In this case we'll only ever have one parent
sending connection requests and one passive connection created as the
listening path sees the existing parent connection which initiated the
request.
Original patch by Zach Brown
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
AF_RDS, PF_RDS and SOL_RDS are available in header files,
and there is no need to get their values from /proc. Document
this correctly.
Fixes: 0c5f9b8830 ("RDS: Documentation")
Signed-off-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The DWMAC block on certain SoCs (such as IMG Pistachio) have a second
clock which must be enabled in order to access the peripheral's
register interface, so add support for requesting and enabling an
optional "pclk".
Signed-off-by: Andrew Bresticker <abrestic@chromium.org>
Cc: James Hartley <james.hartley@imgtec.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 79b16aadea
("udp_tunnel: Pass UDP socket down through udp_tunnel{, 6}_xmit_skb()")
introduce 'sk' but we already have one inner 'sk'.
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vitaly Kuznetsov says:
====================
hv_netvsc: linearize SKBs bigger than MAX_PAGE_BUFFER_COUNT-2 pages
This patch series fixes the same issue which was fixed in Xen with commit
97a6d1bb2b ("xen-netfront: Fix handling packets on
compound pages with skb_linearize").
It is relatively easy to create a packet which is small in size but occupies
more than 30 (MAX_PAGE_BUFFER_COUNT-2) pages. Here is a kernel-mode reproducer
which tries sending a packet with only 34 bytes of payload (but on 34 pages)
and fails:
static int __init sendfb_init(void)
{
struct socket *sock;
int i, ret;
struct sockaddr_in in4_addr = { 0 };
struct page *pages[17];
unsigned long flags;
ret = sock_create_kern(AF_INET, SOCK_STREAM, IPPROTO_TCP, &sock);
if (ret) {
pr_err("failed to create socket: %d!\n", ret);
return ret;
}
in4_addr.sin_family = AF_INET;
/* www.google.com, 74.125.133.99 */
in4_addr.sin_addr.s_addr = cpu_to_be32(0x4a7d8563);
in4_addr.sin_port = cpu_to_be16(80);
ret = sock->ops->connect(sock, (struct sockaddr *)&in4_addr, sizeof(in4_addr), 0);
if (ret) {
pr_err("failed to connect: %d!\n", ret);
return ret;
}
/* We can send up to 17 frags */
flags = MSG_MORE;
for (i = 0; i < 17; i++) {
if (i == 16)
flags = MSG_EOR;
pages[i] = alloc_pages(GFP_KERNEL | __GFP_COMP, 1);
if (!pages[i]) {
pr_err("out of memory!");
goto free_pages;
}
sock->ops->sendpage(sock, pages[i], PAGE_SIZE -1, 2, flags);
}
free_pages:
for (; i > 0; i--)
__free_pages(pages[i - 1], 1);
printk("sendfb_init: test done\n");
return -1;
}
module_init(sendfb_init);
MODULE_LICENSE("GPL");
A try to load such module results in multiple
'kernel: hv_netvsc vmbus_15 eth0: Packet too big: 100' messages as all retries
fail as well. It should also be possible to trigger the issue from userspace, I
expect e.g. NFS under heavy load to get stuck sometimes.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In netvsc_start_xmit() we can handle packets which are scattered around not
more than MAX_PAGE_BUFFER_COUNT-2 pages. It is, however, easy to create a
packet which is not big in size but occupies more pages (e.g. if it uses frags
on compound pages boundaries). When we drop such packet it cases sender to try
resending it but in most cases it will try resending the same packet which will
also get dropped, this will cause the particular connection to stick. To solve
the issue we can try linearizing skb.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
... which validly uses dev_kfree_skb_any() instead of dev_kfree_skb().
Setting ret to -EFAULT and -ENOMEM have no real meaning here (we need to set
it to anything but -EAGAIN) as we drop the packet and return NETDEV_TX_OK
anyway.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Shradha Shah says:
====================
sfc: Nic specific sriov functions, netdev_ops and sriov_configure
First two patches among the series of patches to support SRIOV on EF10.
First patch declares nic specific sriov functions in nic specific headers,
creates only one instance of the netdev_ops, removes sriov functionality
from Falcon code.
Second patch adds support for sriov_configure.
The Virtual Functions can be enabled but they do not bind to the SFC
driver just yet.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for the use of sriov_configure on EF10
to enable Virtual Functions while the driver is loaded.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
By putting all the efx_{siena,ef10}_sriov_* declarations in
{siena,ef10}_sriov.h, ensure they cannot be called from nic-generic code.
Also fixes up an instance of this, where mcdi.c was calling
efx_siena_sriov_flr.
The single instance of netdev_ops should call general high level
functions that can then call something adapter specific in efx_nic_type.
We should only do adapter specialisation via efx_nic_type.
Removal of sriov functionality from the Falcon code means that tests
are needed for the presence of some callbacks.
Signed-off-by: Shradha Shah <sshah@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Alexander Duyck says:
====================
Replace wmb()/rmb() with dma_wmb()/dma_rmb() where appropriate
This is a start of a side project cleaning up the drivers that can make use
of the dma_wmb and dma_rmb calls. The general idea is to start removing
the unnecessary wmb/rmb calls from a number of drivers and to make use of
the lighter weight dma_wmb/dma_rmb calls as this should allow for an
overall improvement in performance as each barrier can cost a significant
number of cycles and on architectures such as x86 this is unnecessary.
These changes are what I would consider low hanging fruit. The likelihood
of the changes introducing an error should be low since the use of the
barriers in these cases are fairly obvious.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This change replaces calls to rmb with dma_rmb in the case where we want to
order all follow-on descriptor reads after the check for the descriptor
status bit.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This change updates several spots where a wmb was being used to instead use
a dma_wmb to flush out writes before updating the control portion of the
descriptor.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch goes through and replaces wmb/rmb with dma_wmb/dma_rmb in cases
where the barrier is being used to order writes or reads to just memory and
doesn't involve any programmed I/O.
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
bond_3ad_bind_slave() calls ad_initialize_port() and then immediately
assigns correct values making some of that initialization unnecessary.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch breaks the rich assignments into it's own statements
and removes some duplicate code where admin-key, & oper-key are
updated.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes: 79b16aadea ("udp_tunnel: Pass UDP socket down through udp_tunnel{, 6}_xmit_skb().")
Reported-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The socket parameter might legally be NULL, thus sock_net is sometimes
causing a NULL pointer dereference. Using net_device pointer in dst_entry
is more reliable.
Fixes: b6a7719aed ("ipv4: hash net ptr into fragmentation bucket selection")
Reported-by: Rick Jones <rick.jones2@hp.com>
Cc: Rick Jones <rick.jones2@hp.com>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add an userdata set extension and allow the user to attach arbitrary
data to set elements. This is intended to hold TLV encoded data like
comments or DNS annotations that have no meaning to the kernel.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add a new "dynset" expression for dynamic set updates.
A new set op ->update() is added which, for non existant elements,
invokes an initialization callback and inserts the new element.
For both new or existing elements the extenstion pointer is returned
to the caller to optionally perform timer updates or other actions.
Element removal is not supported so far, however that seems to be a
rather exotic need and can be added later on.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Currently a set binding is assumed to be related to a lookup and, in
case of maps, a data load.
In order to use bindings for set updates, the loop detection checks
must be restricted to map operations only. Add a flags member to the
binding struct to hold the set "action" flags such as NFT_SET_MAP,
and perform loop detection based on these.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>