The code seems to assume a null is returned when the list is empty
from first_sec_slave() to break the loop which is incorrect. Fix the
code by using list_empty().
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently if user do rmmod keystone_netcp.ko following warning is
seen :-
[ 59.035891] ------------[ cut here ]------------
[ 59.040535] WARNING: CPU: 2 PID: 1619 at drivers/net/ethernet/ti/
netcp_core.c:2127 netcp_remove)
This is because the interface list is not cleaned up in netcp_remove.
This patch fixes this. Also fix some checkpatch related warnings.
Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since mdb states were introduced when deleting an entry the state was
left as it was set in the delete request from the user which leads to
the following output when doing a monitor (for example):
$ bridge mdb add dev br0 port eth3 grp 239.0.0.1 permanent
(monitor) dev br0 port eth3 grp 239.0.0.1 permanent
$ bridge mdb del dev br0 port eth3 grp 239.0.0.1 permanent
(monitor) dev br0 port eth3 grp 239.0.0.1 temp
^^^
Note the "temp" state in the delete notification which is wrong since
the entry was permanent, the state in a delete is always reported as
"temp" regardless of the real state of the entry.
After this patch:
$ bridge mdb add dev br0 port eth3 grp 239.0.0.1 permanent
(monitor) dev br0 port eth3 grp 239.0.0.1 permanent
$ bridge mdb del dev br0 port eth3 grp 239.0.0.1 permanent
(monitor) dev br0 port eth3 grp 239.0.0.1 permanent
There's one important note to make here that the state is actually not
matched when doing a delete, so one can delete a permanent entry by
stating "temp" in the end of the command, I've chosen this fix in order
not to break user-space tools which rely on this (incorrect) behaviour.
So to give an example after this patch and using the wrong state:
$ bridge mdb add dev br0 port eth3 grp 239.0.0.1 permanent
(monitor) dev br0 port eth3 grp 239.0.0.1 permanent
$ bridge mdb del dev br0 port eth3 grp 239.0.0.1 temp
(monitor) dev br0 port eth3 grp 239.0.0.1 permanent
Note the state of the entry that got deleted is correct in the
notification.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Fixes: ccb1c31a7a ("bridge: add flags to distinguish permanent mdb entires")
Signed-off-by: David S. Miller <davem@davemloft.net>
When fast leave is configured on a bridge port and an IGMP leave is
received for a group, the group is not deleted immediately if there is
a router detected or if multicast querier is configured.
Ideally the group should be deleted immediately when fast leave is
configured.
Signed-off-by: Satish Ashok <sashok@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
There are several devices that can receive vlan tagged packets with
CHECKSUM_PARTIAL like tap, possibly veth and xennet.
When (multiple) vlan tagged packets with CHECKSUM_PARTIAL are forwarded
by bridge to a device with the IP_CSUM feature, they end up with checksum
error because before entering bridge, the network header is set to
ETH_HLEN (not including vlan header length) in __netif_receive_skb_core(),
get_rps_cpu(), or drivers' rx functions, and nobody fixes the pointer later.
Since the network header is exepected to be ETH_HLEN in flow-dissection
and hash-calculation in RPS in rx path, and since the header pointer fix
is needed only in tx path, set the appropriate network header on forwarding
packets.
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
tpacket_fill_skb() can return a negative value (-errno) which
is stored in tp_len variable. In that case the following
condition will be (but shouldn't be) true:
tp_len > dev->mtu + dev->hard_header_len
as dev->mtu and dev->hard_header_len are both unsigned.
That may lead to just returning an incorrect EMSGSIZE errno
to the user.
Fixes: 52f1454f62 ("packet: allow to transmit +4 byte in TX_RING slot for VLAN case")
Signed-off-by: Alexander Drozdov <al.drozdov@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
When arp is off on a device, and ioctl(SIOCGARP) is queried,
a buggy answer is given with MAC address of the device, instead
of the mac address of the destination/gateway.
We filter out NUD_NOARP neighbours for /proc/net/arp,
we must do the same for SIOCGARP ioctl.
Tested:
lpaa23:~# ./arp 10.246.7.190
MAC=00:01:e8:22:cb:1d // correct answer
lpaa23:~# ip link set dev eth0 arp off
lpaa23:~# cat /proc/net/arp # check arp table is now 'empty'
IP address HW type Flags HW address Mask Device
lpaa23:~# ./arp 10.246.7.190
MAC=00:1a:11:c3:0d:7f // buggy answer before patch (this is eth0 mac)
After patch :
lpaa23:~# ip link set dev eth0 arp off
lpaa23:~# ./arp 10.246.7.190
ioctl(SIOCGARP) failed: No such device or address
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Vytautas Valancius <valas@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This notification causes the FIB to be updated, which is not needed
because the address already exists, and more importantly it may undo
intentional changes that were made to the FIB after the address was
originally added. (As a point of comparison, when an address becomes
deprecated because its preferred lifetime expired, a notification on
this chain is not generated.)
The motivation for this commit is fixing an incompatibility between
DHCP clients which set and update the address lifetime according to
the lease, and a commercial VPN client which replaces kernel routes
in a way that outbound traffic is sent only through the tunnel (and
disconnects if any further route changes are detected via netlink).
Signed-off-by: David Ward <david.ward@ll.mit.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
These should be handled only by the respective STP which is in control.
They become problematic for devices with limited resources with many
ports because the hold_timer is per port and fires each second and the
hello timer fires each 2 seconds even though it's global. While in
user-space STP mode these timers are completely unnecessary so it's better
to keep them off.
Also ensure that when the bridge is up these timers are started only when
running with kernel STP.
Signed-off-by: Satish Ashok <sashok@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When binding a PF_PACKET socket, the use count of the bound interface is
always increased with dev_hold in dev_get_by_{index,name}. However,
when rebound with the same protocol and device as in the previous bind
the use count of the interface was not decreased. Ultimately, this
caused the deletion of the interface to fail with the following message:
unregister_netdevice: waiting for dummy0 to become free. Usage count = 1
This patch moves the dev_put out of the conditional part that was only
executed when either the protocol or device changed on a bind.
Fixes: 902fefb82e ('packet: improve socket create/bind latency in some cases')
Signed-off-by: Lars Westerhoff <lars.westerhoff@newtec.eu>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Network header is set with offset ETH_HLEN but it is not true for VLAN
(multiple-)tagged and results in checksum issues in lower devices.
v2: leave skb->protocol untouched (thx Vlad), comment added
v3: moved after skb_probe_transport_header() call (thx Toshiaki)
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It was reported that update_suffix was taking a long time on systems where
a large number of leaves were attached to a single node. As it turns out
fib_table_flush was calling update_suffix for each leaf that didn't have all
of the aliases stripped from it. As a result, on this large node removing
one leaf would result in us calling update_suffix for every other leaf on
the node.
The fix is to just remove the calls to leaf_pull_suffix since they are
redundant as we already have a call in resize that will go through and
update the suffix length for the node before we exit out of
fib_table_flush or fib_table_flush_external.
Reported-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Tested-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If an architecture defines readl/writel using CPP macros, we
get the following kinds of build failure:
> > > drivers/net/ethernet/cadence/macb.c:164:1: error: macro "writel"
> > > passed 3 arguments, but takes just 2
> macb_or_gem_writel(bp, SA1B, bottom);
> ^
Rename the methods so that this doesn't happen.
Signed-off-by: David S. Miller <davem@davemloft.net>
When a switch is attached to the mdio bus, the mdio bus can be used
while the interface is not open. If the IPG clock is not enabled, MDIO
reads/writes will simply time out.
Add support for runtime PM to control this clock. Enable/disable this
clock using runtime PM, with open()/close() and mdio read()/write()
function triggering runtime PM operations. Since PM is optional, the
IPG clock is enabled at probe and is no longer modified by
fec_enet_clk_enable(), thus if PM is not enabled in the kernel, it is
guaranteed the clock is running when MDIO operations are performed.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Cc: tyler.baker@linaro.org
Cc: fabio.estevam@freescale.com
Cc: shawn.guo@linaro.org
Tested-by: Fabio Estevam <fabio.estevam@freescale.com>
Tested-by: Tyler Baker <tyler.baker@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch asserts SGMII RTRESET, i.e. resetting the SGMII Tx/Rx
logic, during network interface shutdown to avoid having the
hardware wedge when shutting down with high incoming traffic rates.
This is cleared (brought out of RTRESET) when the interface is
brought back up.
Signed-off-by: WingMan Kwok <w-kwok2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Shevchenko says:
====================
net/macb: fix for AVR32 and clean up
It seems no one had tested recently the driver on AVR32 platforms such as
ATNGW100. This series bring it back to work.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch coverts struct description to the kernel doc format. There is no
functional change.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
macb_count_tx_descriptors() repeats the generic macro DIV_ROUND_UP(). The patch
does a replacement.
There is no functional change.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes the following warnings:
drivers/net/ethernet/cadence/macb.c: In function ‘macb_handle_link_change’:
drivers/net/ethernet/cadence/macb.c:266: warning: comparison between signed and unsigned
drivers/net/ethernet/cadence/macb.c:267: warning: comparison between signed and unsigned
drivers/net/ethernet/cadence/macb.c:291: warning: comparison between signed and unsigned
drivers/net/ethernet/cadence/macb.c: In function ‘gem_update_stats’:
drivers/net/ethernet/cadence/macb.c:1908: warning: comparison between signed and unsigned
drivers/net/ethernet/cadence/macb.c: In function ‘gem_get_ethtool_strings’:
drivers/net/ethernet/cadence/macb.c:1988: warning: comparison between signed and unsigned
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
To avoid messages like
macb macb.0 (unnamed net_device) (uninitialized): Cadence caps 0x00000000
macb macb.0 (unnamed net_device) (uninitialized): invalid hw address, using random
let's use dev_*() macros.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit 98b5a0f4a2 introduces jumbo frame support, but also it assumes
that macb_config present which is not always true.
The configuration without macb_config fails to boot.
Unable to handle kernel NULL pointer dereference at virtual address 00000010
ptbr = 90350000 pgd = 00000000
Oops: Kernel access of bad area, sig: 11 [#1]
FRAME_POINTER chip: 0x01f:0x1e82 rev 2
Modules linked in:
CPU: 0 PID: 1 Comm: swapper Not tainted 4.2.0-rc3-next-20150723+ #13
task: 91c26000 ti: 91c28000 task.ti: 91c28000
PC is at macb_probe+0x140/0x61c
Fixes: 98b5a0f4a2 (net: macb: Add support for jumbo frames)
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The commit a50dad355a (net: macb: Add big endian CPU support) converted I/O
accessors to readl_relaxed() and writel_relaxed() and consequentially broke
MACB driver on AVR32 platforms such as ATNGW100.
This patch improves I/O access by checking endiannes first and use the
corresponding methods.
Fixes: a50dad355a (net: macb: Add big endian CPU support)
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, tcp_recvmsg enters a busy loop in sk_wait_data if called
with flags = MSG_WAITALL | MSG_PEEK.
sk_wait_data waits for sk_receive_queue not empty, but in this case,
the receive queue is not empty, but does not contain any skb that we
can use.
Add a "last skb seen on receive queue" argument to sk_wait_data, so
that it sleeps until the receive queue has new skbs.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=99461
Link: https://sourceware.org/bugzilla/show_bug.cgi?id=18493
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1205258
Reported-by: Enrico Scholz <rh-bugzilla@ensc.de>
Reported-by: Dan Searle <dan@censornet.com>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hayes Wang says:
====================
r8152: issues fix
v2:
Replace patch #2 with "r8152: fix wakeup settings".
v1:
These patches are used to fix issues.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Adjust napi_disable() and napi_enable() to avoid r8152_poll() start
working before rx ready. Otherwise, it may have race condition for
rx_agg.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Avoid the driver to enable WOL if the device doesn't support it.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Disable U1/U2 during initialization.
- Disable lpm when linking is on, and enable it when linking is off.
- Disable U1/U2 when enabling runtime suspend.
It is possible to let hw stop working, if the U1/U2 request occurs
during some situations. The patch is used to avoid it.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johan Hedberg says:
====================
pull request: bluetooth 2015-07-23
Here's another one-patch pull request for 4.2 which targets a potential
NULL pointer dereference in the LE Security Manager code that can be
triggered by using older user space tools. The issue has been there
since 4.0 so there's the appropriate "Cc: stable" in place.
Let me know if there are any issues pulling. Thanks.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This function frees resources and cancels delayed work item that
have been initialized in fec_ptp_init().
Use this to do proper error handling if something goes wrong in
probe function after fec_ptp_init has been called.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Acked-by: Fugang Duan <B38611@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
So it gets freed when the device is going away.
This fixes a DMA memory leak on driver probe() fail and driver
remove().
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes: a3138df9 ("[NIU]: Add Sun Neptune ethernet driver.")
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Florian Westphal says:
====================
inet: ip defrag bug fixes
Johan Schuijt and Frank Schreuder reported crash and softlockup after the
inet workqueue eviction change:
general protection fault: 0000 [#1] SMP
CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted 3.18.18-transip-1.5 #1
Workqueue: events inet_frag_worker
task: ffff880224935130 ti: ffff880224938000 task.ti: ffff880224938000
RIP: 0010:[<ffffffff8149288c>] [<ffffffff8149288c>] inet_evict_bucket+0xfc/0x160
RSP: 0018:ffff88022493bd58 EFLAGS: 00010286
RAX: ffff88021f4f3e80 RBX: dead000000100100 RCX: 000000000000006b
RDX: 000000000000006c RSI: ffff88021f4f3e80 RDI: dead0000001000a8
RBP: 0000000000000002 R08: ffff880222273900 R09: ffff880036e49200
R10: ffff8800c6e86500 R11: ffff880036f45500 R12: ffffffff81a87100
R13: ffff88022493bd70 R14: 0000000000000000 R15: ffff8800c9b26280
[..]
Call Trace:
[<ffffffff814929e0>] ? inet_frag_worker+0x60/0x210
[<ffffffff8107e3a2>] ? process_one_work+0x142/0x3b0
[<ffffffff8107eb94>] ? worker_thread+0x114/0x440
[..]
A second issue results in softlockup since the evictor may restart the
eviction loop for a (potentially) unlimited number of times while local
softirqs are disabled.
Frank reports that test system remained stable for 14 hours of testing
(before, crash occured within half an hour in their setup).
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
We can simply remove the INET_FRAG_EVICTED flag to avoid all the flags
race conditions with the evictor and use a participation test for the
evictor list, when we're at that point (after inet_frag_kill) in the
timer there're 2 possible cases:
1. The evictor added the entry to its evictor list while the timer was
waiting for the chainlock
or
2. The timer unchained the entry and the evictor won't see it
In both cases we should be able to see list_evictor correctly due
to the sync on the chainlock.
Joint work with Florian Westphal.
Tested-by: Frank Schreuder <fschreuder@transip.nl>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Frank reports 'NMI watchdog: BUG: soft lockup' errors when
load is high. Instead of (potentially) unbounded restarts of the
eviction process, just skip to the next entry.
One caveat is that, when a netns is exiting, a timer may still be running
by the time inet_evict_bucket returns.
We use the frag memory accounting to wait for outstanding timers,
so that when we free the percpu counter we can be sure no running
timer will trip over it.
Reported-and-tested-by: Frank Schreuder <fschreuder@transip.nl>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Followup patch will call it after inet_frag_queue was freed, so q->net
doesn't work anymore (but netf = q->net; free(q); mem_limit(netf) would).
Tested-by: Frank Schreuder <fschreuder@transip.nl>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 65ba1f1ec0 ("inet: frags: fix a race between inet_evict_bucket
and inet_frag_kill") describes the bug, but the fix doesn't work reliably.
Problem is that ->flags member can be set on other cpu without chainlock
being held by that task, i.e. the RMW-Cycle can clear INET_FRAG_EVICTED
bit after we put the element on the evictor private list.
We can crash when walking the 'private' evictor list since an element can
be deleted from list underneath the evictor.
Join work with Nikolay Alexandrov.
Fixes: b13d3cbfb8 ("inet: frag: move eviction of queues to work queue")
Reported-by: Johan Schuijt <johan@transip.nl>
Tested-by: Frank Schreuder <fschreuder@transip.nl>
Signed-off-by: Nikolay Alexandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Back then when we added support for SCTP_SNDINFO/SCTP_RCVINFO from
RFC6458 5.3.4/5.3.5, we decided to add a deprecation warning for the
(as per RFC deprecated) SCTP_SNDRCV via commit bbbea41d5e ("net:
sctp: deprecate rfc6458, 5.3.2. SCTP_SNDRCV support"), see [1].
Imho, it was not a good idea, and we should just revert that message
for a couple of reasons:
1) It's uapi and therefore set in stone forever.
2) To be able to run on older and newer kernels, an SCTP application
would need to probe for both, SCTP_SNDRCV, but also SCTP_SNDINFO/
SCTP_RCVINFO support, so that on older kernels, it can make use
of SCTP_SNDRCV, and on newer kernels SCTP_SNDINFO/SCTP_RCVINFO.
In my (limited) experience, a lot of SCTP appliances are migrating
to newer kernels only ve(ee)ry slowly.
3) Some people don't have the chance to change their applications,
f.e. due to proprietary legacy stuff. So, they'll hit this warning
in fast path and are stuck with older kernels.
But i.e. due to point 1) I really fail to see the benefit of a warning.
So just revert that for now, the issue was reported up Jamal.
[1] http://thread.gmane.org/gmane.linux.network/321960/
Reported-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Michael Tuexen <tuexen@fh-muenster.de>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Or Gerlitz says:
====================
mlx4 driver fixes, July 22nd 2015
Just few mlx4 fixes.. the fix related to propagating port state
changes to VF should go to -stable of >= 3.11, all the rest just
to 4.2-rc
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
In mlx4_en_is_ring_empty we check if ring surpassed its size.
Since the prod and cons indicators are u32, there might be a state where
prod wrapped around and cons, making this assert false, although no
actual bug exists (other code segment can cope with this state).
Signed-off-by: Ido Shamay <idos@mellanox.com>
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a port is not attached, the FW requires a longer than usual time to
execute the SENSE_PORT command. In the command flow, the
wait_for_completion_timeout call used in mlx4_cmd_wait puts the kernel
thread into the uninterruptible state during this time. This, in turn,
due to the computation method, causes the CPU load average to increase.
Fix this by using wait_for_completion_interruptible_timeout() for the
SENSE_PORT command, which puts the thread in the interruptible state.
In this state, the thread does not contribute to the CPU load average.
Treat the interrupted case as if the SENSE_PORT command returned
port_type = NONE.
Fix suggested by Gideon Naim <gideonn@mellanox.com> and
Bart Van Assche <bart.vanassche@sandisk.com>.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The port-change event processing in procedure mlx4_eq_int() uses "slave"
as the vf_oper array index. Since the value of "slave" is the PF function
index, the result is that the PF link state is used for deciding to
propagate the event for all the VFs. The VF link state should be used,
so the VF function index should be used here.
Fixes: 948e306d7d ('net/mlx4: Add VF link state support')
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some old PF drivers don't let VFs allocate counters, in that case, use
the sink counter so the VF can load and operate properly.
Fixes: 6de5f7f6a1 ('net/mlx4_core: Allocate default counter per port')
Reported-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since slave_changelink support was added there have been a few race
conditions when using br_setport() since some of the port functions it
uses require the bridge lock. It is very easy to trigger a lockup due to
some internal spin_lock() usage without bh disabled, also it's possible to
get the bridge into an inconsistent state.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Fixes: 3ac636b859 ("bridge: implement rtnl_link_ops->slave_changelink")
Reviewed-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter/IPVS fixes for net
The following patchset contains ten Netfilter/IPVS fixes, they are:
1) Address refcount leak when creating an expectation from the ctnetlink
interface.
2) Fix bug splat in the IDLETIMER target related to sysfs, from Dmitry
Torokhov.
3) Resolve panic for unreachable route in IPVS with locally generated
traffic in the output path, from Alex Gartrell.
4) Fix wrong source address in rare cases for tunneled traffic in IPVS,
from Julian Anastasov.
5) Fix crash if scheduler is changed via ipvsadm -E, again from Julian.
6) Make sure skb->sk is unset for forwarded traffic through IPVS, again from
Alex Gartrell.
7) Fix crash with IPVS sync protocol v0 and FTP, from Julian.
8) Reset sender cpu for forwarded traffic in IPVS, also from Julian.
9) Allocate template conntracks through kmalloc() to resolve netns dependency
problems with the conntrack kmem_cache.
10) Fix zones with expectations that clash using the same tuple, from Joe
Stringer.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Otherwise the skbuff related structures are not correctly
refcount'ed.
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Julian Anastasov says:
====================
ipv4: fib_select_default changes
This patchset contains 2 changes for the alternative routes,
one to add tb_id/fa_slen check needed after the recent
fib_trie optimizations for fib aliases and the second
change attempts to support alternative routes with TOS
requirement.
Sorry that I don't have access to the original
report from Hagen Paul Pfeifer. I hope he will see this
change.
The second change adds fa_default field to the
fib aliases (which can be many) and if the feature to
filter the alternative routes by TOS is not worth it,
this second patch can be scrapped.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
fib_select_default considers alternative routes only when
res->fi is for the first alias in res->fa_head. In the
common case this can happen only when the initial lookup
matches the first alias with highest TOS value. This
prevents the alternative routes to require specific TOS.
This patch solves the problem as follows:
- routes that require specific TOS should be returned by
fib_select_default only when TOS matches, as already done
in fib_table_lookup. This rule implies that depending on the
TOS we can have many different lists of alternative gateways
and we have to keep the last used gateway (fa_default) in first
alias for the TOS instead of using single tb_default value.
- as the aliases are ordered by many keys (TOS desc,
fib_priority asc), we restrict the possible results to
routes with matching TOS and lowest metric (fib_priority)
and routes that match any TOS, again with lowest metric.
For example, packet with TOS 8 can not use gw3 (not lowest
metric), gw4 (different TOS) and gw6 (not lowest metric),
all other gateways can be used:
tos 8 via gw1 metric 2 <--- res->fa_head and res->fi
tos 8 via gw2 metric 2
tos 8 via gw3 metric 3
tos 4 via gw4
tos 0 via gw5
tos 0 via gw6 metric 1
Reported-by: Hagen Paul Pfeifer <hagen@jauu.net>
Signed-off-by: Julian Anastasov <ja@ssi.bg>
Signed-off-by: David S. Miller <davem@davemloft.net>