Commit Graph

678559 Commits

Author SHA1 Message Date
Jacob Keller
6964e53f55 i40e: fix handling of HW ATR eviction
A recent commit to refactor the driver and remove the hw_disabled_flags
field accidentally introduced two regressions. First, we overwrote
pf->flags which removed various key flags including the MSI-X settings.

Additionally, it was intended that we have now two flags,
HW_ATR_EVICT_CAPABLE and HW_ATR_EVICT_ENABLED, but this was not done,
and we accidentally were mis-using HW_ATR_EVICT_CAPABLE everywhere.

This patch adds the missing piece, HW_ATR_EVICT_ENABLED, and safely
updates pf->flags instead of overwriting it.

Without this patch we will have many problems including disabling MSI-X
support, and we'll attempt to use HW ATR eviction on devices which do
not support it.

Fixes: 47994c119a ("i40e: remove hw_disabled_flags in favor of using separate flag bits", 2017-04-19)
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-12 18:53:02 -04:00
Karicheri, Muralidharan
675c8da049 hsr: fix incorrect warning
When HSR interface is setup using ip link command, an annoying warning
appears with the trace as below:-

[  203.019828] hsr_get_node: Non-HSR frame
[  203.019833] Modules linked in:
[  203.019848] CPU: 0 PID: 158 Comm: sd-resolve Tainted: G        W       4.12.0-rc3-00052-g9fa6bf70 #2
[  203.019853] Hardware name: Generic DRA74X (Flattened Device Tree)
[  203.019869] [<c0110280>] (unwind_backtrace) from [<c010c2f4>] (show_stack+0x10/0x14)
[  203.019880] [<c010c2f4>] (show_stack) from [<c04b9f64>] (dump_stack+0xac/0xe0)
[  203.019894] [<c04b9f64>] (dump_stack) from [<c01374e8>] (__warn+0xd8/0x104)
[  203.019907] [<c01374e8>] (__warn) from [<c0137548>] (warn_slowpath_fmt+0x34/0x44)
root@am57xx-evm:~# [  203.019921] [<c0137548>] (warn_slowpath_fmt) from [<c081126c>] (hsr_get_node+0x148/0x170)
[  203.019932] [<c081126c>] (hsr_get_node) from [<c0814240>] (hsr_forward_skb+0x110/0x7c0)
[  203.019942] [<c0814240>] (hsr_forward_skb) from [<c0811d64>] (hsr_dev_xmit+0x2c/0x34)
[  203.019954] [<c0811d64>] (hsr_dev_xmit) from [<c06c0828>] (dev_hard_start_xmit+0xc4/0x3bc)
[  203.019963] [<c06c0828>] (dev_hard_start_xmit) from [<c06c13d8>] (__dev_queue_xmit+0x7c4/0x98c)
[  203.019974] [<c06c13d8>] (__dev_queue_xmit) from [<c0782f54>] (ip6_finish_output2+0x330/0xc1c)
[  203.019983] [<c0782f54>] (ip6_finish_output2) from [<c0788f0c>] (ip6_output+0x58/0x454)
[  203.019994] [<c0788f0c>] (ip6_output) from [<c07b16cc>] (mld_sendpack+0x420/0x744)

As this is an expected path to hsr_get_node() with frame coming from
the master interface, add a check to ensure packet is not from the
master port and then warn.

Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-12 15:21:20 -04:00
Christian Perle
3500cd73df proc: snmp6: Use correct type in memset
Reading /proc/net/snmp6 yields bogus values on 32 bit kernels.
Use "u64" instead of "unsigned long" in sizeof().

Fixes: 4a4857b1c8 ("proc: Reduce cache miss in snmp6_seq_show")
Signed-off-by: Christian Perle <christian.perle@secunet.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-12 09:53:14 -04:00
Donald Sharp
4b1f0d33db net: ipmr: Fix some mroute forwarding issues in vrf's
This patch fixes two issues:

1) When forwarding on *,G mroutes that are in a vrf, the
kernel was dropping information about the actual incoming
interface when calling ip_mr_forward from ip_mr_input.
This caused ip_mr_forward to send the multicast packet
back out the incoming interface.  Fix this by
modifying ip_mr_forward to be handed the correctly
resolved dev.

2) When a unresolved cache entry is created we store
the incoming skb on the unresolved cache entry and
upon mroute resolution from the user space daemon,
we attempt to forward the packet.  Again we were
not resolving to the correct incoming device for
a vrf scenario, before calling ip_mr_forward.
Fix this by resolving to the correct interface
and calling ip_mr_forward with the result.

Fixes: e58e415968 ("net: Enable support for VRF with ipv4 multicast")
Signed-off-by: Donald Sharp <sharpd@cumulusnetworks.com>
Acked-by: David Ahern <dsahern@gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Reviewed-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 18:15:06 -04:00
David S. Miller
062bb997d2 mlx5-fixes-2017-06-11
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJZPRegAAoJEEg/ir3gV/o+UGwH/RoGd4Z8dGEkHJliBoVHTQ8d
 X0iT3Pk3ho9Y8EQzLauhX9HRDTGvWIFo8uEIfe8uXsUVy9Dyb0UJ5fjYW1UQrNo1
 6p08aG6I3IxYiKjfw6a/Csxc5hlxMS5g70jWbuZmTQZwAOagzvS14T9I5JFMmgcH
 2fI4/cr4a0GHm4NMCObiTtuNC1CWM1llJ+zttUpcxFbpDkMPcD3SOzmmcz9eZx9c
 18GaBdTOXcDkmQKIdixuxjlrQp2og2/eeHDMV8PSxjmo3yURhK97qc844sdbxCKP
 wcHXwCUWXtluLBQ/6eU4LXgKcDn1aYHX38sj8YgstIc0JRG3uFCjIq+zPylBNNI=
 =tb/u
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-fixes-2017-06-11' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
Mellanox mlx5 fixes 2017-06-11

This series contains some fixes for the mlx5 core and netdev driver.

Please pull and let me know if there's any problem.

For -stable:
("net/mlx5e: Added BW check for DIM decision mechanism")              kernels >= 4.9
("net/mlx5e: Fix wrong indications in DIM due to counter wraparound") kernels >= 4.9
("net/mlx5: Remove several module events out of ethtool stats")       kernels >= 4.10
("net/mlx5: Enable 4K UAR only when page size is bigger than 4K")     kernels >= 4.11

*all patches apply with no issue on their -stable.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:40:52 -04:00
David S. Miller
77a6bb5ac0 Merge branch 'ena-fixes'
Netanel Belgazal says:

====================
Bugs fixes in ena ethernet driver

This patchset contains fixes for the bugs that were discovered so far.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:48 -04:00
Netanel Belgazal
e7ff7efae5 net: ena: update ena driver to version 1.1.7
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:47 -04:00
Netanel Belgazal
800c55cb76 net: ena: bug fix in lost tx packets detection mechanism
check_for_missing_tx_completions() is called from a timer
task and looking for lost tx packets.
The old implementation accumulate all the lost tx packets
and did not check if those packets were retrieved on a later stage.
This cause to a situation where the driver reset
the device for no reason.

Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:47 -04:00
Netanel Belgazal
a2cc5198da net: ena: disable admin msix while working in polling mode
Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:47 -04:00
Netanel Belgazal
a3af7c18cf net: ena: fix theoretical Rx hang on low memory systems
For the rare case where the device runs out of free rx buffer
descriptors (in case of pressure on kernel  memory),
and the napi handler continuously fail to refill new Rx descriptors
until device rx queue totally runs out of all free rx buffers
to post incoming packet, leading to a deadlock:
* The device won't send interrupts since all the new
Rx packets will be dropped.
* The napi handler won't try to allocate new Rx descriptors
since allocation is part of NAPI that's not being invoked any more

The fix involves detecting this scenario and rescheduling NAPI
(to refill buffers) by the keepalive/watchdog task.

Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:46 -04:00
Netanel Belgazal
0857d92f71 net: ena: add missing unmap bars on device removal
This patch also change the mapping functions to devm_ functions

Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:46 -04:00
Netanel Belgazal
661d2b0cce net: ena: fix race condition between submit and completion admin command
Bug:
"Completion context is occupied" error printout will be noticed in
dmesg.
This error will cause the admin command to fail, which will lead to
an ena_probe() failure or a watchdog reset (depends on which admin
command failed).

Root cause:
__ena_com_submit_admin_cmd() is the function that submits new entries to
the admin queue.
The function have a check that makes sure the queue is not full and the
function does not override any outstanding command.
It uses head and tail indexes for this check.
The head is increased by ena_com_handle_admin_completion() which runs
from interrupt context, and the tail index is increased by the submit
function (the function is running under ->q_lock, so there is no risk
of multithread increment).
Each command is associated with a completion context. This context
allocated before call to __ena_com_submit_admin_cmd() and freed by
ena_com_wait_and_process_admin_cq_interrupts(), right after the command
was completed.

This can lead to a state where the head was increased, the check passed,
but the completion context is still in use.

Solution:
Use the atomic variable ->outstanding_cmds instead of using the head and
the tail indexes.
This variable is safe for use since it is bumped in get_comp_ctx() in
__ena_com_submit_admin_cmd() and is freed by comp_ctxt_release()

Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:46 -04:00
Netanel Belgazal
2d2c600a91 net: ena: add missing return when ena_com_get_io_handlers() fails
Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:46 -04:00
Netanel Belgazal
418df30f7e net: ena: fix bug that might cause hang after consecutive open/close interface.
Fixing a bug that the driver does not unmask the IO interrupts
in ndo_open():
occasionally, the MSI-X interrupt (for one or more IO queues)
can be masked when ndo_close() was called.
If that is followed by ndo open(),
then the MSI-X will be still masked so no interrupt
will be received by the driver.

Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:46 -04:00
Netanel Belgazal
a77c1aafcc net: ena: fix rare uncompleted admin command false alarm
The current flow to detect admin completion is:
while (command_not_completed) {
	if (timeout)
		error

	check_for_completion()
		sleep()
   }
So in case the sleep took more than the timeout
(in case the thread/workqueue was not scheduled due to higher priority
task or prolonged VMexit), the driver can detect a stall even if
the completion is present.

The fix changes the order of this function to first check for
completion and only after that check if the timeout expired.

Fixes: 1738cd3ed3 ("Add a driver for Amazon Elastic Network Adapters (ENA)")
Signed-off-by: Netanel Belgazal <netanel@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-11 16:36:45 -04:00
Majd Dibbiny
91828bd899 net/mlx5: Enable 4K UAR only when page size is bigger than 4K
When the page size isn't bigger than 4K, there is no added value of enabling 4K
UAR feature in the Firmware.

Modified the condition of enabling the 4K UAR accordingly.

Fixes: f502d83495 ("net/mlx5: Activate support for 4K UARs")
Signed-off-by: Majd Dibbiny <majd@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-11 13:10:36 +03:00
Tal Gilboa
53acd76ce5 net/mlx5e: Fix wrong indications in DIM due to counter wraparound
DIM (Dynamically-tuned Interrupt Moderation) is a mechanism designed for
changing the channel interrupt moderation values in order to reduce CPU
overhead for all traffic types.
Each iteration of the algorithm, DIM calculates the difference in
throughput, packet rate and interrupt rate from last iteration in order
to make a decision. DIM relies on counters for each metric. When these
counters get to their type's max value they wraparound. In this case
the delta between 'end' and 'start' samples is negative and when
translated to unsigned integers - very high. This results in a false
indication to the algorithm and might result in a wrong decision.

The fix calculates the 'distance' between 'end' and 'start' samples in a
cyclic way around the relevant type's max value. It can also be viewed as
an absolute value around the type's max value instead of around 0.

Testing show higher stability in DIM profile selection and no wraparound
issues.

Fixes: cb3c7fd4f8 ("net/mlx5e: Support adaptive RX coalescing")
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-11 13:10:36 +03:00
Tal Gilboa
c3164d2fc4 net/mlx5e: Added BW check for DIM decision mechanism
DIM (Dynamically-tuned Interrupt Moderation) is a mechanism designed for
changing the channel interrupt moderation values in order to reduce CPU
overhead for all traffic types.
Until now only interrupt and packet rate were sampled.
We found a scenario on which we get a false indication since a change in
DIM caused more aggregation and reduced packet rate while increasing BW.

We now regard a change as succesfull iff:
current_BW > (prev_BW + threshold) or
current_BW ~= prev_BW and current_PR > (prev_PR + threshold) or
current_BW ~= prev_BW and current_PR ~= prev_PR and
    current_IR < (prev_IR - threshold)
Where BW = Bandwidth, PR = Packet rate and IR = Interrupt rate

Improvements (ConnectX-4Lx 25GbE, single RX queue, LRO off)
    --------------------------------------------------
    packet size | before[Mb/s] | after[Mb/s] | gain  |
    2B          | 343.4        | 359.4       |  4.5% |
    16B         | 2739.7       | 2814.8      |  2.7% |
    64B         | 9739         | 10185.3     |  4.5% |

Fixes: cb3c7fd4f8 ("net/mlx5e: Support adaptive RX coalescing")
Signed-off-by: Tal Gilboa <talgi@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-11 13:10:36 +03:00
Huy Nguyen
f729860a17 net/mlx5: Remove several module events out of ethtool stats
Remove the following module event counters out of ethtool stats. The
reason for removing these event counters is that these events do not
occur without techinician's intervention.
  module_pwr_budget_exd
  module_long_range
  module_no_eeprom
  module_enforce_part
  module_unknown_id
  module_unknown_status
  module_plug

Fixes: bedb7c909c ("net/mlx5e: Add port module event counters to ethtool stats")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed by: Gal Pressman <galp@mellanox.com>

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-11 13:10:36 +03:00
Mohamad Haj Yahia
3fece5d676 net/mlx5: Continue health polling until it is explicitly stopped
The issue is that when we get an assert we will stop polling the health
and thus we cant enter error state when we have a real health issue.

Fixes: fd76ee4da5 ('net/mlx5_core: Fix internal error detection conditions')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Reviewed-by: Daniel Jurgens <danielj@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-11 13:10:36 +03:00
Mohamad Haj Yahia
57f35c93a2 net/mlx5: Fix create vport flow table flow
Send vport number to the create flow table inner method instead of
ignoring the vport argument and sending always 0.

Fixes: b3ba51498b ('net/mlx5: Refactor create flow table method to accept underlay QP')
Signed-off-by: Mohamad Haj Yahia <mohamad@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2017-06-11 13:10:36 +03:00
David S. Miller
b87fa0fafe Merge branch 'mvpp2-fixes'
Thomas Petazzoni says:

====================
net: mvpp2: driver fixes

As requested, here is a series of patches containing only bug fixes
for the mvpp2 driver. It is based on the latest "net" branch.

Changes since v1:

 - Fixed a build breakage that occurred when only PATCH 1 was only,
   and not later patches in the series. Was reported by the kbuild
   report on the first submission.

 - Added Tested-by from Marc Zyngier on PATCH 2.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 18:22:56 -04:00
Thomas Petazzoni
a704bb5c05 net: mvpp2: use {get, put}_cpu() instead of smp_processor_id()
smp_processor_id() should not be used in migration-enabled contexts. We
originally thought it was OK in the specific situation of this driver,
but it was wrong, and calling smp_processor_id() in a migration-enabled
context prints a big fat warning when CONFIG_DEBUG_PREEMPT=y.

Therefore, this commit replaces the smp_processor_id() in
migration-enabled contexts by the appropriate get_cpu/put_cpu sections.

Reported-by: Marc Zyngier <marc.zyngier@arm.com>
Fixes: a786841df7 ("net: mvpp2: handle register mapping and access for PPv2.2")
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Tested-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 18:22:55 -04:00
Thomas Petazzoni
56b8aae959 net: mvpp2: remove mvpp2_bm_cookie_{build,pool_get}
This commit removes the useless remove
mvpp2_bm_cookie_{build,pool_get} functions. All what
mvpp2_bm_cookie_build() was doing is compute a 32-bit value by
concatenating the pool number and the CPU number... only to get the pool
number re-extracted by mvpp2_bm_cookie_pool_get() later on.

Instead, just get the pool number directly from RX descriptor status,
and pass it to mvpp2_pool_refill() and mvpp2_rx_refill().

This has the added benefit of dropping a smp_processor_id() call in a
migration-enabled context, which is wrong, and is the original
motivation for making this change.

Fixes: 3f518509de ("ethernet: Add new driver for Marvell Armada 375 network unit")
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 18:22:54 -04:00
Jia-Ju Bai
343eba69c6 net: tipc: Fix a sleep-in-atomic bug in tipc_msg_reverse
The kernel may sleep under a rcu read lock in tipc_msg_reverse, and the
function call path is:
tipc_l2_rcv_msg (acquire the lock by rcu_read_lock)
  tipc_rcv
    tipc_sk_rcv
      tipc_msg_reverse
        pskb_expand_head(GFP_KERNEL) --> may sleep
tipc_node_broadcast
  tipc_node_xmit_skb
    tipc_node_xmit
      tipc_sk_rcv
        tipc_msg_reverse
          pskb_expand_head(GFP_KERNEL) --> may sleep

To fix it, "GFP_KERNEL" is replaced with "GFP_ATOMIC".

Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 18:20:38 -04:00
Jia-Ju Bai
f146e872eb net: caif: Fix a sleep-in-atomic bug in cfpkt_create_pfx
The kernel may sleep under a rcu read lock in cfpkt_create_pfx, and the
function call path is:
cfcnfg_linkup_rsp (acquire the lock by rcu_read_lock)
  cfctrl_linkdown_req
    cfpkt_create
      cfpkt_create_pfx
        alloc_skb(GFP_KERNEL) --> may sleep
cfserl_receive (acquire the lock by rcu_read_lock)
  cfpkt_split
    cfpkt_create_pfx
      alloc_skb(GFP_KERNEL) --> may sleep

There is "in_interrupt" in cfpkt_create_pfx to decide use "GFP_KERNEL" or
"GFP_ATOMIC". In this situation, "GFP_KERNEL" is used because the function
is called under a rcu read lock, instead in interrupt.

To fix it, only "GFP_ATOMIC" is used in cfpkt_create_pfx.

Signed-off-by: Jia-Ju Bai <baijiaju1990@163.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 18:19:45 -04:00
David S. Miller
5aa32f53ab Revert "net: fec: Add a fec_enet_clear_ethtool_stats() stub for CONFIG_M5272"
This reverts commit bf292f1b2c.

It belongs in 'net-next' not 'net'.

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 16:44:28 -04:00
Xin Long
581409dacc sctp: disable BH in sctp_for_each_endpoint
Now sctp holds read_lock when foreach sctp_ep_hashtable without disabling
BH. If CPU schedules to another thread A at this moment, the thread A may
be trying to hold the write_lock with disabling BH.

As BH is disabled and CPU cannot schedule back to the thread holding the
read_lock, while the thread A keeps waiting for the read_lock. A dead
lock would be triggered by this.

This patch is to fix this dead lock by calling read_lock_bh instead to
disable BH when holding the read_lock in sctp_for_each_endpoint.

Fixes: 626d16f50f ("sctp: export some apis or variables for sctp_diag and reuse some for proc")
Reported-by: Xiumei Mu <xmu@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 16:18:10 -04:00
Fabio Estevam
bf292f1b2c net: fec: Add a fec_enet_clear_ethtool_stats() stub for CONFIG_M5272
Commit 2b30842b23 ("net: fec: Clear and enable MIB counters on imx51")
introduced fec_enet_clear_ethtool_stats(), but missed to add a stub
for the CONFIG_M5272=y case, causing build failure for the
m5272c3_defconfig.

Add the missing empty stub to fix the build failure.

Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 16:16:21 -04:00
Dominik Heidler
9b3dc0a17d l2tp: cast l2tp traffic counter to unsigned
This fixes a counter problem on 32bit systems:
When the rx_bytes counter reached 2 GiB, it jumpd to (2^64 Bytes - 2GiB) Bytes.

rtnl_link_stats64 has __u64 type and atomic_long_read returns
atomic_long_t which is signed. Due to the conversation
we get an incorrect value on 32bit systems if the MSB of
the atomic_long_t value is set.

CC: Tom Parkin <tparkin@katalix.com>
Fixes: 7b7c0719cd ("l2tp: avoid deadlock in l2tp stats update")
Signed-off-by: Dominik Heidler <dheidler@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 16:14:27 -04:00
Philippe Reynes
d8dba51de5 net: aquantia: atlantic: remove declaration of hw_atl_utils_hw_set_power
This function is not defined, so no need to declare it.

As I don't have the hardware, I'd be very pleased if
someone may test this patch.

Signed-off-by: Philippe Reynes <tremyfr@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 16:09:56 -04:00
David S. Miller
d9a8d6a102 Merge branch 'bnx2x-Fix-malicious-VFs-indication'
Yuval Mintz says:

====================
bnx2x: Fix malicious VFs indication

It was discovered that for a VF there's a simple [yet uncommon] scenario
which would cause device firmware to declare that VF as malicious -
Add a vlan interface on top of a VF and disable txvlan offloading for
that VF [causing VF to transmit packets where vlan is on payload].

Patch #1 corrects driver transmission to prevent this issue.
Patch #2 is a by-product correcting PF behavior once a VF is declared
malicious.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 16:02:56 -04:00
Mintz, Yuval
3523882229 bnx2x: Don't post statistics to malicious VFs
Once firmware indicates that a given VF is malicious and until
that VF passes an FLR all bets are off - PF can't know anything
is happening to the VF [since VF can't communicate anything to its PF].
But PF is currently still periodically asking device to collect
statistics for the VF which might in turn fill logs by IOMMU blocking
memory access done by the VF's PCI function [in the case VF has unmapped
its buffers].

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 16:02:55 -04:00
Mintz, Yuval
92f85f05ca bnx2x: Allow vfs to disable txvlan offload
VF clients are configured as enforced, meaning firmware is validating
the correctness of their ethertype/vid during transmission.
Once txvlan is disabled, VF would start getting SKBs for transmission
here vlan is on the payload - but it'll pass the packet's ethertype
instead of the vid, leading to firmware declaring it as malicious.

Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-10 16:02:54 -04:00
David S. Miller
f6d4c71332 linux-can-fixes-for-4.12-20170609
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEE4bay/IylYqM/npjQHv7KIOw4HPYFAlk6mLwTHG1rbEBwZW5n
 dXRyb25peC5kZQAKCRAe/sog7Dgc9vx5CADTRLfMxgmmGsLcM8nO0sKn/7X/o4Hy
 qprZy1Nu4J49kG/93pumIvWkcG9fzZ4hbDy+4DiEmL/fIumbvg56aaW90fyvW9Cl
 HzL3S7Id8Db6Qwfr4Wd7r1IRYn08+om4ukO09GdskmMpS4wOr68+3WhutGGu1mEX
 ZS2sTE/dKnozZqUw6agcwXJmAJWFbYBaVcF+x2GT07ako0jYXF29zqe/GlpmzVBJ
 Wmib9DO2BeDm/QpNIot/I3Se86xS0AIH6ds++YRJX6k9dlFb4T2Liw6UysPSVCz6
 QkAT3SiTu3TKoT1TXEyMQEQSpnrFtpMowdTk/s04bKibVIpTTskZm44c
 =DNKW
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-fixes-for-4.12-20170609' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2017-06-09

this is a pull request of 6 patches for net/master.

There's a patch by Stephane Grosjean that fixes an uninitialized symbol warning
in the peak_canfd driver. A patch by Johan Hovold to fix the product-id
endianness in an error message in the the peak_usb driver. A patch by Oliver
Hartkopp to enable CAN FD for virtual CAN devices by default. Three patches by
me, one makes the helper function can_change_state() robust to be called with
cf == NULL. The next patch fixes a memory leak in the gs_usb driver. And the
last one fixes a lockdep splat by properly initialize the per-net
can_rcvlists_lock spin_lock.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-09 15:41:57 -04:00
Johannes Berg
c7a61cba71 mac80211: free netdev on dev_alloc_name() error
The change to remove free_netdev() from ieee80211_if_free()
erroneously didn't add the necessary free_netdev() for when
ieee80211_if_free() is called directly in one place, rather
than as the priv_destructor. Add the missing call.

Fixes: cf124db566 ("net: Fix inconsistent teardown and release of private netdev state.")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-09 15:40:15 -04:00
ashwanth@codeaurora.org
773fc8f6e8 net: rps: send out pending IPI's on CPU hotplug
IPI's from the victim cpu are not handled in dev_cpu_callback.
So these pending IPI's would be sent to the remote cpu only when
NET_RX is scheduled on the victim cpu and since this trigger is
unpredictable it would result in packet latencies on the remote cpu.

This patch add support to send the pending ipi's of victim cpu.

Signed-off-by: Ashwanth Goli <ashwanth@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-09 15:35:01 -04:00
Mario Molitor
33d4c48213 stmmac: fix for hw timestamp of GMAC3 unit
1.) Bugfix of function stmmac_get_tx_hwtstamp.
    Corrected the tx timestamp available check (same as 4.8 and older)
    Change printout from info syslevel to debug.

2.) Bugfix of function stmmac_get_rx_hwtstamp.
    Corrected the rx timestamp available check (same as 4.8 and older)
    Change printout from info syslevel to debug.

Fixes: ba1ffd74df ("stmmac: fix PTP support for GMAC4")
Signed-off-by: Mario Molitor <mario_molitor@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-09 12:39:26 -04:00
Mario Molitor
fd6720aefd stmmac: fix ptp header for GMAC3 hw timestamp
According the CYCLON V documention only the bit 16 of snaptypesel should
set.
(more information see Table 17-20 (cv_5v4.pdf) :
 Timestamp Snapshot Dependency on Register Bits)

Fixes: d2042052a0 ("stmmac: update the PTP header file")
Signed-off-by: Mario Molitor <mario_molitor@web.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-09 12:39:25 -04:00
Krister Johansen
f186ce61bb Fix an intermittent pr_emerg warning about lo becoming free.
It looks like this:

Message from syslogd@flamingo at Apr 26 00:45:00 ...
 kernel:unregister_netdevice: waiting for lo to become free. Usage count = 4

They seem to coincide with net namespace teardown.

The message is emitted by netdev_wait_allrefs().

Forced a kdump in netdev_run_todo, but found that the refcount on the lo
device was already 0 at the time we got to the panic.

Used bcc to check the blocking in netdev_run_todo.  The only places
where we're off cpu there are in the rcu_barrier() and msleep() calls.
That behavior is expected.  The msleep time coincides with the amount of
time we spend waiting for the refcount to reach zero; the rcu_barrier()
wait times are not excessive.

After looking through the list of callbacks that the netdevice notifiers
invoke in this path, it appears that the dst_dev_event is the most
interesting.  The dst_ifdown path places a hold on the loopback_dev as
part of releasing the dev associated with the original dst cache entry.
Most of our notifier callbacks are straight-forward, but this one a)
looks complex, and b) places a hold on the network interface in
question.

I constructed a new bcc script that watches various events in the
liftime of a dst cache entry.  Note that dst_ifdown will take a hold on
the loopback device until the invalidated dst entry gets freed.

[      __dst_free] on DST: ffff883ccabb7900 IF tap1008300eth0 invoked at 1282115677036183
    __dst_free
    rcu_nocb_kthread
    kthread
    ret_from_fork
Acked-by: Eric Dumazet <edumazet@google.com>

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-09 12:27:28 -04:00
Mateusz Jurczyk
defbcf2dec af_unix: Add sockaddr length checks before accessing sa_family in bind and connect handlers
Verify that the caller-provided sockaddr structure is large enough to
contain the sa_family field, before accessing it in bind() and connect()
handlers of the AF_UNIX socket. Since neither syscall enforces a minimum
size of the corresponding memory region, very short sockaddrs (zero or
one byte long) result in operating on uninitialized memory while
referencing .sa_family.

Signed-off-by: Mateusz Jurczyk <mjurczyk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-09 10:10:24 -04:00
Joe Perches
fc5b775da4 net: phy: add missing SPEED_14000
Fixes: 0d7e2d2166 ("IB/ipoib: add get_link_ksettings in ethtool")
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-09 09:53:25 -04:00
Oliver Hartkopp
97edec3a11 can: enable CAN FD for virtual CAN devices by default
CAN FD capable CAN interfaces can handle (classic) CAN 2.0 frames too.
New users usually fail at their first attempt to explore CAN FD on
virtual CAN interfaces due to the current CAN_MTU default.

Set the MTU to CANFD_MTU by default to reduce this confusion.
If someone *really* needs a 'classic CAN'-only device this can be set
with the 'ip' tool with e.g. 'ip link set vcan0 mtu 16' as before.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2017-06-09 14:39:02 +02:00
Marc Kleine-Budde
74b7b49088 can: af_can: namespace support: fix lockdep splat: properly initialize spin_lock
This patch uses spin_lock_init() instead of __SPIN_LOCK_UNLOCKED() to
initialize the per namespace net->can.can_rcvlists_lock lock to fix this
lockdep warning:

| INFO: trying to register non-static key.
| the code is fine but needs lockdep annotation.
| turning off the locking correctness validator.
| CPU: 0 PID: 186 Comm: candump Not tainted 4.12.0-rc3+ #47
| Hardware name: Marvell Kirkwood (Flattened Device Tree)
| [<c0016644>] (unwind_backtrace) from [<c00139a8>] (show_stack+0x18/0x1c)
| [<c00139a8>] (show_stack) from [<c0058c8c>] (register_lock_class+0x1e4/0x55c)
| [<c0058c8c>] (register_lock_class) from [<c005bdfc>] (__lock_acquire+0x148/0x1990)
| [<c005bdfc>] (__lock_acquire) from [<c005deec>] (lock_acquire+0x174/0x210)
| [<c005deec>] (lock_acquire) from [<c04a6780>] (_raw_spin_lock+0x50/0x88)
| [<c04a6780>] (_raw_spin_lock) from [<bf02116c>] (can_rx_register+0x94/0x15c [can])
| [<bf02116c>] (can_rx_register [can]) from [<bf02a868>] (raw_enable_filters+0x60/0xc0 [can_raw])
| [<bf02a868>] (raw_enable_filters [can_raw]) from [<bf02ac14>] (raw_enable_allfilters+0x2c/0xa0 [can_raw])
| [<bf02ac14>] (raw_enable_allfilters [can_raw]) from [<bf02ad38>] (raw_bind+0xb0/0x250 [can_raw])
| [<bf02ad38>] (raw_bind [can_raw]) from [<c03b5fb8>] (SyS_bind+0x70/0xac)
| [<c03b5fb8>] (SyS_bind) from [<c000f8c0>] (ret_fast_syscall+0x0/0x1c)

Cc: Mario Kicherer <dev@kicherer.org>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2017-06-09 11:39:23 +02:00
Marc Kleine-Budde
5cda3ee513 can: gs_usb: fix memory leak in gs_cmd_reset()
This patch adds the missing kfree() in gs_cmd_reset() to free the
memory that is not used anymore after usb_control_msg().

Cc: linux-stable <stable@vger.kernel.org>
Cc: Maximilian Schneider <max@schneidersoft.net>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2017-06-09 11:39:23 +02:00
Johan Hovold
dadcd398b3 can: peak_usb: fix product-id endianness in error message
Make sure to use the USB device product-id stored in host-byte order in
a probe error message.

Also remove a redundant reassignment of the local usb_dev variable which
had already been used to retrieve the product id.

Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2017-06-09 11:39:23 +02:00
Stephane Grosjean
f2a918b40c can: peak_canfd: fix uninitialized symbol warnings
This patch fixes two uninitialized symbol warnings in the new code adding
support of the PEAK-System PCAN-PCI Express FD boards, in the socket-CAN
network protocol family.

Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2017-06-09 11:39:23 +02:00
Marc Kleine-Budde
ff3416fb5b can: dev: make can_change_state() robust to be called with cf == NULL
In OOM situations where no skb can be allocated, can_change_state() may
be called with cf == NULL. As this function updates the state and error
statistics it's not an option to skip the call to can_change_state() in
OOM situations.

This patch makes can_change_state() robust, so that it can be called
with cf == NULL.

Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2017-06-09 11:39:23 +02:00
David Ahern
097d3c9508 net: vrf: Make add_fib_rules per network namespace flag
Commit 1aa6c4f6b8 ("net: vrf: Add l3mdev rules on first device create")
adds the l3mdev FIB rule the first time a VRF device is created. However,
it only creates the rule once and only in the namespace the first device
is created - which may not be init_net. Fix by using the net_generic
capability to make the add_fib_rules flag per network namespace.

Fixes: 1aa6c4f6b8 ("net: vrf: Add l3mdev rules on first device create")
Reported-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08 19:27:42 -04:00
Daniel Borkmann
78a5a93c1e bpf, tests: fix endianness selection
I noticed that test_l4lb was failing in selftests:

  # ./test_progs
  test_pkt_access:PASS:ipv4 77 nsec
  test_pkt_access:PASS:ipv6 44 nsec
  test_xdp:PASS:ipv4 2933 nsec
  test_xdp:PASS:ipv6 1500 nsec
  test_l4lb:PASS:ipv4 377 nsec
  test_l4lb:PASS:ipv6 544 nsec
  test_l4lb:FAIL:stats 6297600000 200000
  test_tcp_estats:PASS: 0 nsec
  Summary: 7 PASSED, 1 FAILED

Tracking down the issue actually revealed that endianness selection
in bpf_endian.h is broken when compiled with clang with bpf target.
test_pkt_access.c, test_l4lb.c is compiled with __BYTE_ORDER as
__BIG_ENDIAN, test_xdp.c as __LITTLE_ENDIAN! test_l4lb noticeably
fails, because the test accounts bytes via bpf_ntohs(ip6h->payload_len)
and bpf_ntohs(iph->tot_len), and compares them against a defined
value and given a wrong endianness, the test outcome is different,
of course.

Turns out that there are actually two bugs: i) when we do __BYTE_ORDER
comparison with __LITTLE_ENDIAN/__BIG_ENDIAN, then depending on the
include order we see different outcomes. Reason is that __BYTE_ORDER
is undefined due to missing endian.h include. Before we include the
asm/byteorder.h (e.g. through linux/in.h), then __BYTE_ORDER equals
__LITTLE_ENDIAN since both are undefined, after the include which
correctly pulls in linux/byteorder/little_endian.h, __LITTLE_ENDIAN
is defined, but given __BYTE_ORDER is still undefined, we match on
__BYTE_ORDER equals to __BIG_ENDIAN since __BIG_ENDIAN is also
undefined at that point, sigh. ii) But even that would be wrong,
since when compiling the test cases with clang, one can select between
bpfeb and bpfel targets for cross compilation. Hence, we can also not
rely on what the system's endian.h provides, but we need to look at
the compiler's defined endianness. The compiler defines __BYTE_ORDER__,
and we can match __ORDER_LITTLE_ENDIAN__ and __ORDER_BIG_ENDIAN__,
which also reflects targets bpf (native), bpfel, bpfeb correctly,
thus really only rely on that. After patch:

  # ./test_progs
  test_pkt_access:PASS:ipv4 74 nsec
  test_pkt_access:PASS:ipv6 42 nsec
  test_xdp:PASS:ipv4 2340 nsec
  test_xdp:PASS:ipv6 1461 nsec
  test_l4lb:PASS:ipv4 400 nsec
  test_l4lb:PASS:ipv6 530 nsec
  test_tcp_estats:PASS: 0 nsec
  Summary: 7 PASSED, 0 FAILED

Fixes: 43bcf707cc ("bpf: fix _htons occurences in test_progs")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-08 16:17:29 -04:00