Commit 37706996 "mlx4_en: fix allocation of CPU affinity reverse-map" fixed
a bug when mlx4_dev->caps.comp_pool is larger from the device rx rings, but
introduced a regression.
When the mlx4_core is activating its "legacy mode" (e.g when running in SRIOV
mode) w.r.t to EQs/IRQs usage, comp_pool becomes zero and we're crashing on
divide by zero alloc_cpu_rmap.
Fix that by enabling RFS only when running in non-legacy mode.
Reported-by: Yan Burman <yanb@mellanox.com>
Cc: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Signed-off-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure we cleanup all MAC related resources (entries in the port MAC
table and steering rules) when stopping a port or when the driver is unloaded.
The leak was introduced by commit 07cb4b0a "net/mlx4_en: Manage hash of MAC
addresses per port".
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove unnecessary use of workqueue for the device MAC address setting
flow, and fix a race when setting MAC address which was introduced by
commit c07cb4b0a "net/mlx4_en: Manage hash of MAC addresses per port"
The race happened when mlx4_en_replace_mac was being executed in parallel
with a successive call to ndo_set_mac_address, e.g witn an A/B/A MAC
setting configuration test, the third set fails.
With this change we also properly report an error if set MAC fails.
Signed-off-by: Yan Burman <yanb@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The set_param_l function assumes casting a u64 pointer to a u32 pointer
allows to access the lower 32bits, but it results in writing the upper
32 bits on big endian systems.
The fixed function reads the upper 32 bits of the 64 argument, and or's
them with the 32 bits of the 32-bit value passed to the function.
Since this is now a "read-modify-write" operation, we got many
"unintialized variable" warnings which needed to be fixed as well.
Reported-by: Alexander Schmidt <alexschm@de.ibm.com>.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Older kernels detect DMFS (device-managed flow steering) from the HCA
device capability directly, regardless of whether the capability was
enabled in INIT_HCA, this is fixed by commit 7b8157bed "mlx4_core: Adjustments
to Flow Steering activation logic for SR-IOV"
To protect against guests running kernels without this fix, the host driver
should turn off the DMFS capability bit in mlx4_QUERY_DEV_CAP_wrapper.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guests kernels may not correctly detect if DMFS (device-enabled flow steering) is
activated by the host. If DMFS is activated, the master should return error to guests
which try to use the B0-steering flow calls (mlx4_QP_ATTACH).
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch fixes some annoying messages like 'Error reading PHY register' and
'Hardware Erorr' and saves several seconds on reboot.
Cc: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Borislav Petkov <bp@suse.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch removes redundant actions from driver and fixes its interaction
with actions in pci-bus runtime power management code.
It removes pci_save_state() from __e1000_shutdown() for normal adapters,
PCI bus callbacks pci_pm_*() will do all this for us. Now __e1000_shutdown()
switches to D3-state only quad-port adapters, because they needs quirk for
clearing false-positive error from downsteam pci-e port.
pci_save_state() now called after clearing bus-master bit, thus __e1000_resume()
and e1000_io_slot_reset() must set it back after restoring configuration space.
This patch set get_link_status before calling pm_runtime_put() in e1000_open()
to allow e1000_idle() get real link status and schedule first runtime suspend.
This patch also enables wakeup for device if management mode is enabled
(like for WoL) as result pci_prepare_to_sleep() would setup wakeup without
special actions like custom 'enable_wakeup' sign.
Cc: Bruce Allan <bruce.w.allan@intel.com>
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Borislav Petkov <bp@suse.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch removes redundant and unbalanced pci_disable_device() from
__e1000_shutdown(). pci_clear_master() is enough, device can go into
suspended state with elevated enable_cnt.
Bug was introduced in commit 23606cf5d1
("e1000e / PCI / PM: Add basic runtime PM support (rev. 4)") in v2.6.35
Cc: Bruce Allan <bruce.w.allan@intel.com>
CC: Stable <stable@kernel.org>
Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Borislav Petkov <bp@suse.de>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
The SLIPORT_SEMAPHORE register shadowed in the
config-space may not reflect the correct POST stage after
an EEH reset in BE2/3; it may return FW_READY state even though
FW is not ready. This causes the driver to prematurely
poll the FW mailbox and fail.
For BE2/3 use the CSR-BAR/0xac instead.
Reported-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings says:
====================
Fix regressions introduced by the last set of fixes (sorry):
1. Potential deadlock when disabling TX queues.
2. RX was broken on architectures other than x86 and powerpc.
I still expect to send one more bug fix for 3.9, but as it sometimes
takes days to reproduce the bug it's going to take a couple of weeks of
testing to be confident that it's really fixed.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
RX DMA buffers start at an offset of EFX_PAGE_IP_ALIGN bytes from the
start of a cache line. This offset obviously needs to be included in
the virtual address, but this was missed in commit b590ace09d
('sfc: Fix efx_rx_buf_offset() in the presence of swiotlb') since
EFX_PAGE_IP_ALIGN is equal to 0 on both x86 and powerpc.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
efx_device_detach_sync() locks all TX queues before marking the device
detached and thus disabling further TX scheduling. But it can still
be interrupted by TX completions which then result in TX scheduling in
soft interrupt context. This will deadlock when it tries to acquire
a TX queue lock that efx_device_detach_sync() already acquired.
To avoid deadlock, we must use netif_tx_{,un}lock_bh().
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
While PCI card faces EEH errors, reset (usually hot reset) is
expected to recover from the EEH errors. After EEH core finishes
the reset, the driver callback (be_eeh_reset) is called and wait
the firmware to complete POST successfully. The original code would
return with error once detecting failure during POST stage. That
seems not enough.
The patch forces the driver (be_eeh_reset) to wait the firmware
completes POST until timeout, instead of returning error upon
detection POST failure immediately. Also, it would improve the
reliability of the EEH funtionality of the driver.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Acked-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jeff Kirsher says:
===================
This series contains fixes to e1000e and igb.
The e1000e fix resolves an issue at 1000Mbps link speed, where one of the
MAC's internal clocks can be stopped for up to 4us when entering K1 (a
power mode of the MAC-PHY interconnect). If the MAC is waiting for
completion indications for 2 DMA write requests into Host memory
(e.g. descriptor writeback or Rx packet writing) and the
indications occur while the clock is stopped, both indications will be
missed by the MAC causing the MAC to wait for the completion indications
and be unable to generate further DMA write requests. This results in an
apparent hardware hang. The patch works-around the issue by disabling
the de-assertion of the clock request when 1000Mbps link is acquired (K1
must be disabled while doing this).
The igb fix to drop BUILD_BUG_ON check from igb_build_rx_buffer resolves
a build error on s390 devices. The igb driver was throwing a build error
due to the fact that a frame built using build_skb would be larger than 2K.
Since this is not likely to change at any point in the future we are better
off just dropping the check since we already had a check in
igb_set_rx_buffer_len that will just disable the usage of build_skb anyway.
The igb fix for i210 link setup changes the setup copper link function
to use a switch statement, so that the appropriate setup link function
is called for the given PHY types.
Lastly, the igb fix for a lockdep issue in igb_get_i2c_client resolves
the issue by re-factoring the initialization and usage of the i2c_client.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
"A moderately sized pile of fixes, some specifically for merge window
introduced regressions although others are for longer standing items
and have been queued up for -stable.
I'm kind of tired of all the RDS protocol bugs over the years, to be
honest, it's way out of proportion to the number of people who
actually use it.
1) Fix missing range initialization in netfilter IPSET, from Jozsef
Kadlecsik.
2) ieee80211_local->tim_lock needs to use BH disabling, from Johannes
Berg.
3) Fix DMA syncing in SFC driver, from Ben Hutchings.
4) Fix regression in BOND device MAC address setting, from Jiri
Pirko.
5) Missing usb_free_urb in ISDN Hisax driver, from Marina Makienko.
6) Fix UDP checksumming in bnx2x driver for 57710 and 57711 chips,
fix from Dmitry Kravkov.
7) Missing cfgspace_lock initialization in BCMA driver.
8) Validate parameter size for SCTP assoc stats getsockopt(), from
Guenter Roeck.
9) Fix SCTP association hangs, from Lee A Roberts.
10) Fix jumbo frame handling in r8169, from Francois Romieu.
11) Fix phy_device memory leak, from Petr Malat.
12) Omit trailing FCS from frames received in BGMAC driver, from Hauke
Mehrtens.
13) Missing socket refcount release in L2TP, from Guillaume Nault.
14) sctp_endpoint_init should respect passed in gfp_t, rather than use
GFP_KERNEL unconditionally. From Dan Carpenter.
15) Add AISX AX88179 USB driver, from Freddy Xin.
16) Remove MAINTAINERS entries for drivers deleted during the merge
window, from Cesar Eduardo Barros.
17) RDS protocol can try to allocate huge amounts of memory, check
that the user's request length makes sense, from Cong Wang.
18) SCTP should use the provided KMALLOC_MAX_SIZE instead of it's own,
bogus, definition. From Cong Wang.
19) Fix deadlocks in FEC driver by moving TX reclaim into NAPI poll,
from Frank Li. Also, fix a build error introduced in the merge
window.
20) Fix bogus purging of default routes in ipv6, from Lorenzo Colitti.
21) Don't double count RTT measurements when we leave the TCP receive
fast path, from Neal Cardwell."
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (61 commits)
tcp: fix double-counted receiver RTT when leaving receiver fast path
CAIF: fix sparse warning for caif_usb
rds: simplify a warning message
net: fec: fix build error in no MXC platform
net: ipv6: Don't purge default router if accept_ra=2
net: fec: put tx to napi poll function to fix dead lock
sctp: use KMALLOC_MAX_SIZE instead of its own MAX_KMALLOC_SIZE
rds: limit the size allocated by rds_message_alloc()
MAINTAINERS: remove eexpress
MAINTAINERS: remove drivers/net/wan/cycx*
MAINTAINERS: remove 3c505
caif_dev: fix sparse warnings for caif_flow_cb
ax88179_178a: ASIX AX88179_178A USB 3.0/2.0 to gigabit ethernet adapter driver
sctp: use the passed in gfp flags instead GFP_KERNEL
ipv[4|6]: correct dropwatch false positive in local_deliver_finish
l2tp: Restore socket refcount when sendmsg succeeds
net/phy: micrel: Disable asymmetric pause for KSZ9021
bgmac: omit the fcs
phy: Fix phy_device_free memory leak
bnx2x: Fix KR2 work-around condition
...
This patch fixes a lockdep warning in igb_get_i2c_client by
refactoring the initialization and usage of the i2c_client
completely. There is no on the fly allocation of the single
client needed today.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
This patch changes the setup copper link function to use a switch
statement for the PHY id's available for the given PHY types. It
also adds a case for the I210 PHY id, so the appropriate setup link
function is called for it.
Signed-off-by: Carolyn Wyborny <carolyn.wyborny@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
On s390 the igb driver was throwing a build error due to the fact that a frame
built using build_skb would be larger than 2K. Since this is not likely to
change at any point in the future we are better off just dropping the check
since we already had a check in igb_set_rx_buffer_len that will just disable
the usage of build_skb anyway.
Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com>
Tested-by: Aaron Brown <aaron.f.brown@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
At 1000Mbps link speed, one of the MAC's internal clocks can be stopped for
up to 4us when entering K1 (a power mode of the MAC-PHY interconnect). If
the MAC is waiting for completion indications for 2 DMA write requests into
Host memory (e.g. descriptor writeback or Rx packet writing) and the
indications occur while the clock is stopped, both indications will be
missed by the MAC causing the MAC to wait for the completion indications
and be unable to generate further DMA write requests. This results in an
apparent hardware hang.
Work-around the issue by disabling the de-assertion of the clock request
when 1000Mbps link is acquired (K1 must be disabled while doing this).
Signed-off-by: Bruce Allan <bruce.w.allan@intel.com>
Tested-by: Jeff Pieper <jeffrey.e.pieper@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
build error cause by
Commit ff43da86c6
("NET: FEC: dynamtic check DMA desc buff type")
drivers/net/ethernet/freescale/fec.c: In function ‘fec_enet_get_nextdesc’:
drivers/net/ethernet/freescale/fec.c:215:18: error: invalid use of undefined type ‘struct bufdesc_ex’
drivers/net/ethernet/freescale/fec.c: In function ‘fec_enet_get_prevdesc’:
drivers/net/ethernet/freescale/fec.c:224:18: error: invalid use of undefined type ‘struct bufdesc_ex’
drivers/net/ethernet/freescale/fec.c: In function ‘fec_enet_start_xmit’:
drivers/net/ethernet/freescale/fec.c:286:37: error: arithmetic on pointer to an incomplete type
drivers/net/ethernet/freescale/fec.c:287:13: error: arithmetic on pointer to an incomplete type
drivers/net/ethernet/freescale/fec.c:324:7: error: dereferencing pointer to incomplete type etc....
Signed-off-by: Frank Li <Frank.Li@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull MIPS updates from Ralf Baechle:
o Add basic support for the Mediatek/Ralink Wireless SoC family.
o The Qualcomm Atheros platform is extended by support for the new
QCA955X SoC series as well as a bunch of patches that get the code
ready for OF support.
o Lantiq and BCM47XX platform have a few improvements and bug fixes.
o MIPS has sent a few patches that get the kernel ready for the
upcoming microMIPS support.
o The rest of the series is made up of small bug fixes and cleanups
that relate to various parts of the MIPS code. The biggy in there is
a whitespace cleanup. After I was sent another set of whitespace
cleanup patches I decided it was the time to clean the whitespace
"issues" for once and and that touches many files below arch/mips/.
Fix up silly conflicts, mostly due to whitespace cleanups.
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (105 commits)
MIPS: Quit exporting kernel internel break codes to uapi/asm/break.h
MIPS: remove broken conditional inside vpe loader code
MIPS: SMTC: fix implicit declaration of set_vi_handler
MIPS: early_printk: drop __init annotations
MIPS: Probe for and report hardware virtualization support.
MIPS: ath79: add support for the Qualcomm Atheros AP136-010 board
MIPS: ath79: add USB controller registration code for the QCA955X SoCs
MIPS: ath79: add PCI controller registration code for the QCA955X SoCs
MIPS: ath79: add WMAC registration code for the QCA955X SoCs
MIPS: ath79: register UART for the QCA955X SoCs
MIPS: ath79: add QCA955X specific glue to ath79_device_reset_{set, clear}
MIPS: ath79: add GPIO setup code for the QCA955X SoCs
MIPS: ath79: add IRQ handling code for the QCA955X SoCs
MIPS: ath79: add clock setup code for the QCA955X SoCs
MIPS: ath79: add SoC detection code for the QCA955X SoCs
MIPS: ath79: add early printk support for the QCA955X SoCs
MIPS: ath79: fix WMAC IRQ resource assignment
mips: reserve elfcorehdr
mips: Make sure kernel memory is in iomem
MIPS: ath79: use dynamically allocated USB platform devices
...
Do not include the frame check sequence when adding the skb to
netif_receive_skb(). This causes problems when this interface was
bridged to a wifi ap and a big package should be forwarded from this
Ethernet driver through a bride to the wifi client.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix condition typo for running KR2 work-around though it doesn't have
real effect since the typo bits matched by chance.
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix KR2 link down problem after reboot when link speed is reconfigured via ethtool.
Since 1G/10G support link speed were missing by default, 1G/10G link speed were
not advertised.
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the "ethtool -p" for boards with BCM84834, by using LED4 of the PHY
to toggle the link LED while keeping interrupt disabled to avoid NIG attentions,
and at the end restore NIG to previous state.
Signed-off-by: Yaniv Rosner <yanivr@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Some hardware start settings implicitely assume an usual 1500 bytes mtu
that can't be guaranteed because changes of mtu may be requested both
before and after the hardware is started.
Reported-by: Tomi Orava <tomimo@ncircle.nullnet.fi>
Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
Cc: Hayes Wang <hayeswang@realtek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
I'm not sure why, but the hlist for each entry iterators were conceived
list_for_each_entry(pos, head, member)
The hlist ones were greedy and wanted an extra parameter:
hlist_for_each_entry(tpos, pos, head, member)
Why did they need an extra pos parameter? I'm not quite sure. Not only
they don't really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.
Besides the semantic patch, there was some manual work required:
- Fix up the actual hlist iterators in linux/list.h
- Fix up the declaration of other iterators based on the hlist ones.
- A very small amount of places were using the 'node' parameter, this
was modified to use 'obj->member' instead.
- Coccinelle didn't handle the hlist_for_each_entry_safe iterator
properly, so those had to be fixed up manually.
The semantic patch which is mostly the work of Peter Senna Tschudin is here:
@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;
type T;
expression a,c,d,e;
identifier b;
statement S;
@@
-T b;
<+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
...+>
[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foudnation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin <peter.senna@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Signed-off-by: Sasha Levin <sasha.levin@oracle.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since commit 86564c3f "bnx2x: Remove many sparse warnings" UDP
csum offload is broken for 57710/57711. Fix return value.
Signed-off-by: Dmitry Kravkov <dmitry@broadcom.com>
CC: Ariel Elior <ariele@broadcom.com>
CC: Yuval Mintz <yuvalmin@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull vfs pile (part one) from Al Viro:
"Assorted stuff - cleaning namei.c up a bit, fixing ->d_name/->d_parent
locking violations, etc.
The most visible changes here are death of FS_REVAL_DOT (replaced with
"has ->d_weak_revalidate()") and a new helper getting from struct file
to inode. Some bits of preparation to xattr method interface changes.
Misc patches by various people sent this cycle *and* ocfs2 fixes from
several cycles ago that should've been upstream right then.
PS: the next vfs pile will be xattr stuff."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (46 commits)
saner proc_get_inode() calling conventions
proc: avoid extra pde_put() in proc_fill_super()
fs: change return values from -EACCES to -EPERM
fs/exec.c: make bprm_mm_init() static
ocfs2/dlm: use GFP_ATOMIC inside a spin_lock
ocfs2: fix possible use-after-free with AIO
ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path
get_empty_filp()/alloc_file() leave both ->f_pos and ->f_version zero
target: writev() on single-element vector is pointless
export kernel_write(), convert open-coded instances
fs: encode_fh: return FILEID_INVALID if invalid fid_type
kill f_vfsmnt
vfs: kill FS_REVAL_DOT by adding a d_weak_revalidate dentry op
nfsd: handle vfs_getattr errors in acl protocol
switch vfs_getattr() to struct path
default SET_PERSONALITY() in linux/elf.h
ceph: prepopulate inodes only when request is aborted
d_hash_and_lookup(): export, switch open-coded instances
9p: switch v9fs_set_create_acl() to inode+fid, do it before d_instantiate()
9p: split dropping the acls from v9fs_set_create_acl()
...
Make cpsw_add_default_vlan() look at the actual number of slaves for its
iteration, so boards with less than 2 slaves don't ooops at boot.
Signed-off-by: Daniel Mack <zonque@gmail.com>
Cc: Mugunthan V N <mugunthanvnm@ti.com>
Cc: David S. Miller <davem@davemloft.net>
Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ben Hutchings says:
====================
Some fixes that should go into 3.9:
1. Fix packet corruption when using non-coherent RX DMA buffers.
2. Fix occasional watchdog misfiring when changing MTU or ring size.
These are longstanding bugs and should be fixed in stable as well, but
I'd like to review other recent fixes first and send a separate request
for stable inclusion.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) ping_err() ICMP error handler looks at wrong ICMP header, from Li
Wei.
2) TCP socket hash function on ipv6 is too weak, from Eric Dumazet.
3) netif_set_xps_queue() forgets to drop mutex on errors, fix from
Alexander Duyck.
4) sum_frag_mem_limit() can deadlock due to lack of BH disabling, fix
from Eric Dumazet.
5) TCP SYN data is miscalculated in tcp_send_syn_data(), because the
amount of TCP option space was not taken into account properly in
this code path. Fix from yuchung Cheng.
6) MLX4 driver allocates device queues with the wrong size, from Kleber
Sacilotto.
7) sock_diag can access past the end of the sock_diag_handlers[] array,
from Mathias Krause.
8) vlan_set_encap_proto() makes incorrect assumptions about where
skb->data points, rework the logic so that it works regardless of
where skb->data happens to be. From Jesse Gross.
9) Fix gianfar build failure with NET_POLL enabled, from Paul
Gortmaker.
10) Fix Ipv4 ID setting and checksum calculations in GRE driver, from
Pravin B Shelar.
11) bgmac driver does:
int i;
for (i = 0; ...; ...) {
...
for (i = 0; ...; ...) {
effectively corrupting the outer loop index, use a seperate
variable for the inner loops. From Rafał Miłecki.
12) Fix suspend bugs in smsc95xx driver, from Ming Lei.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (35 commits)
usbnet: smsc95xx: rename FEATURE_AUTOSUSPEND
usbnet: smsc95xx: fix broken runtime suspend
usbnet: smsc95xx: fix suspend failure
bgmac: fix indexing of 2nd level loops
b43: Fix lockdep splat on module unload
Revert "ip_gre: propogate target device GSO capability to the tunnel device"
IP_GRE: Fix GRE_CSUM case.
VXLAN: Use tunnel_ip_select_ident() for tunnel IP-Identification.
IP_GRE: Fix IP-Identification.
net/pasemi: Fix missing coding style
vmxnet3: fix ethtool ring buffer size setting
vmxnet3: make local function static
bnx2x: remove dead code and make local funcs static
gianfar: fix compile fail for NET_POLL=y due to struct packing
vlan: adjust vlan_set_encap_proto() for its callers
sock_diag: Simplify sock_diag_handlers[] handling in __sock_diag_rcv_msg
sock_diag: Fix out-of-bounds access to sock_diag_handlers[]
vxlan: remove depends on CONFIG_EXPERIMENTAL
mlx4_en: fix allocation of CPU affinity reverse-map
mlx4_en: fix allocation of device tx_cq
...
- SRP error handling fixes from Bart Van Assche
- Implementation of memory windows for mlx4 from Shani Michaeli
- Lots of cxgb4 HW driver fixes from Vipul Pandya
- Make iSER work for virtual functions, other fixes from Or Gerlitz
- Fix for bug in qib HW driver from Mike Marciniszyn
- IPoIB fixes from me, Itai Garbi, Shlomo Pongratz, Yan Burman
- Various cleanups and warning fixes from Julia Lawall, Paul Bolle, Wei Yongjun
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABCAAGBQJRLPNoAAoJEENa44ZhAt0h0bMP/1xqlgDR3DGdoUoV4yTd6O9G
Uccdb7og5o5tedVo+8xZz01y4at99P8FWW4hJ4os6k8n6yKoBVwo7qjN+BOR+JG5
q8+O+ynUSIg4tGrb5sXcMnKXAXbw/vkftMWYNA41cbrM24DTYzB/2SLpvhwbFoTT
tdQc2tgz5QaDqzWbagyCR4+k/IgO+Llrz/RvIdtz4dsTnTDogN7QCoSffX8n/Lpb
DxtyXK4sdl3DAtd3CsIdsB/TSMb3RkHLCoSvmrWlLnqMdsbRxVnCVfBm4BOghW3J
Y2K3joRoCjjIZSRNs/i0FMFkT/jbCXg1oXg9ek/a6YFNcgyk7z8iGyXrRY7fOnno
8U2SfxJ69YpVYeJr+DSjaeHcmjsaYU7NN7JPxzvPKcJOIsxQJ/euJDXAXau3lEQY
o9/p4JsGty0WHi1NanyygvghvBAoP1C5/59Sl4bHH5gckPyJT1kinPSCTT76YXGS
WkSHg2mDhiJHy7Pnuy85iZldPoy2/5z09/I4aGMeL+8kUZbD4iFqzXIJU0HTsAim
EONoRXDhIcN5DNVSVH1ig6nJ2a7Vhov4Z0r/vB8P4KhslBcqFwf2leC0eCoe5mNt
SzcKhqosZDXoL8AwzpntzGIOid8pWmHbUx/PgIcoVXPjtl0h2ULNIFoYYyMZ3cyU
AyN2tSiUZVddTV1/aKGL
=RAQw
-----END PGP SIGNATURE-----
Merge tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
Pull infiniband update from Roland Dreier:
"Main batch of InfiniBand/RDMA changes for 3.9:
- SRP error handling fixes from Bart Van Assche
- Implementation of memory windows for mlx4 from Shani Michaeli
- Lots of cxgb4 HW driver fixes from Vipul Pandya
- Make iSER work for virtual functions, other fixes from Or Gerlitz
- Fix for bug in qib HW driver from Mike Marciniszyn
- IPoIB fixes from me, Itai Garbi, Shlomo Pongratz, Yan Burman
- Various cleanups and warning fixes from Julia Lawall, Paul Bolle,
Wei Yongjun"
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband: (41 commits)
IB/mlx4: Advertise MW support
IB/mlx4: Support memory window binding
mlx4: Implement memory windows allocation and deallocation
mlx4_core: Enable memory windows in {INIT, QUERY}_HCA
mlx4_core: Disable memory windows for virtual functions
IPoIB: Free ipoib neigh on path record failure so path rec queries are retried
IB/srp: Fail I/O requests if the transport is offline
IB/srp: Avoid endless SCSI error handling loop
IB/srp: Avoid sending a task management function needlessly
IB/srp: Track connection state properly
IB/mlx4: Remove redundant NULL check before kfree
IB/mlx4: Fix compiler warning about uninitialized 'vlan' variable
IB/mlx4: Convert is_xxx variables in build_mlx_header() to bool
IB/iser: Enable iser when FMRs are not supported
IB/iser: Avoid error prints on EAGAIN registration failures
IB/iser: Use proper define for the commands per LUN value advertised to SCSI ML
IB/uverbs: Implement memory windows support in uverbs
IB/core: Add "type 2" memory windows support
mlx4_core: Propagate MR deregistration failures to caller
mlx4_core: Rename MPT-related functions to have mpt_ prefix
...
We must only ever stop TX queues when they are full or the net device
is not 'ready' so far as the net core, and specifically the watchdog,
is concerned. Otherwise, the watchdog may fire *immediately* if no
packets have been added to the queue in the last 5 seconds.
The device is ready if all the following are true:
(a) It has a qdisc
(b) It is marked present
(c) It is running
(d) The link is reported up
(a) and (c) are normally true, and must not be changed by a driver.
(d) is under our control, but fake link changes may disturb userland.
This leaves (b). We already mark the device absent during reset
and self-test, but we need to do the same during MTU changes and ring
reallocation. We don't need to do this when the device is brought
down because then (c) is already false.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
We assume that the mapping between DMA and virtual addresses is done
on whole pages, so we can find the page offset of an RX buffer using
the lower bits of the DMA address. However, swiotlb maps in units of
2K, breaking this assumption.
Add an explicit page_offset field to struct efx_rx_buffer.
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
We may currently allocate two RX DMA buffers to a page, and only unmap
the page when the second is completed. We do not sync the first RX
buffer to be completed; this can result in packet loss or corruption
if the last RX buffer completed in a NAPI poll is the first in a page
and is not DMA-coherent. (In the middle of a NAPI poll, we will
handle the following RX completion and unmap the page *before* looking
at the content of the first buffer.)
Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
We were using the same variable for iterating two nested loops.
Reported-by: Tijs Van Buggenhout <tvb@able.be>
Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Implement MW allocation and deallocation in mlx4_core and mlx4_ib.
Pass down the enable bind flag when registering memory regions.
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Add memory windows-related code to INIT_HCA and QUERY_HCA.
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Do not enable memory windows allocation for virtual functions.
In addition, add a few safety checks, such as:
* Verifying the PD of a new MPT matches the VF.
* Making sure binding memory window isn't enabled for FMRs, and
that new memory windows are not FMR themselves.
Signed-off-by: Haggai Eran <haggaie@mellanox.com>
Signed-off-by: Shani Michaeli <shanim@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
Fix missing () & { }
Signed-off-by: Syam Sidhardhan <s.syam@samsung.com>
Acked-by: Olof Johansson <olof@lixom.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sparse warned about several functions that were unnecessarily global.
After making them static, discovered that several functions were actually never used.
Compile tested only.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Acked-by: Eilon Greenstein <eilong@broadcomo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit ee873fda3b ("gianfar: Pack struct
gfar_priv_grp into three cachelines") moved the irq number and names
off into a separate struct and created accessors for them. However
it was never tested with NET_POLL enabled, and so some conversions
that were simply overlooked went undetected until now.
Make the netpoll ones also use the gfar_irq() accessors.
Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Claudiu Manoil <claudiu.manoil@freescale.com>
Cc: Jianhua Xie <jianhua.xie@freescale.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull powerpc updates from Benjamin Herrenschmidt:
"So from the depth of frozen Minnesota, here's the powerpc pull request
for 3.9. It has a few interesting highlights, in addition to the
usual bunch of bug fixes, minor updates, embedded device tree updates
and new boards:
- Hand tuned asm implementation of SHA1 (by Paulus & Michael
Ellerman)
- Support for Doorbell interrupts on Power8 (kind of fast
thread-thread IPIs) by Ian Munsie
- Long overdue cleanup of the way we handle relocation of our open
firmware trampoline (prom_init.c) on 64-bit by Anton Blanchard
- Support for saving/restoring & context switching the PPR (Processor
Priority Register) on server processors that support it. This
allows the kernel to preserve thread priorities established by
userspace. By Haren Myneni.
- DAWR (new watchpoint facility) support on Power8 by Michael Neuling
- Ability to change the DSCR (Data Stream Control Register) which
controls cache prefetching on a running process via ptrace by
Alexey Kardashevskiy
- Support for context switching the TAR register on Power8 (new
branch target register meant to be used by some new specific
userspace perf event interrupt facility which is yet to be enabled)
by Ian Munsie.
- Improve preservation of the CFAR register (which captures the
origin of a branch) on various exception conditions by Paulus.
- Move the Bestcomm DMA driver from arch powerpc to drivers/dma where
it belongs by Philippe De Muyter
- Support for Transactional Memory on Power8 by Michael Neuling
(based on original work by Matt Evans). For those curious about
the feature, the patch contains a pretty good description."
(See commit db8ff90702: "powerpc: Documentation for transactional
memory on powerpc" for the mentioned description added to the file
Documentation/powerpc/transactional_memory.txt)
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (140 commits)
powerpc/kexec: Disable hard IRQ before kexec
powerpc/85xx: l2sram - Add compatible string for BSC9131 platform
powerpc/85xx: bsc9131 - Correct typo in SDHC device node
powerpc/e500/qemu-e500: enable coreint
powerpc/mpic: allow coreint to be determined by MPIC version
powerpc/fsl_pci: Store the pci ctlr device ptr in the pci ctlr struct
powerpc/85xx: Board support for ppa8548
powerpc/fsl: remove extraneous DIU platform functions
arch/powerpc/platforms/85xx/p1022_ds.c: adjust duplicate test
powerpc: Documentation for transactional memory on powerpc
powerpc: Add transactional memory to pseries and ppc64 defconfigs
powerpc: Add config option for transactional memory
powerpc: Add transactional memory to POWER8 cpu features
powerpc: Add new transactional memory state to the signal context
powerpc: Hook in new transactional memory code
powerpc: Routines for FP/VSX/VMX unavailable during a transaction
powerpc: Add transactional memory unavaliable execption handler
powerpc: Add reclaim and recheckpoint functions for context switching transactional memory processes
powerpc: Add FP/VSX and VMX register load functions for transactional memory
powerpc: Add helper functions for transactional memory context switching
...
The mlx4_en driver allocates the number of objects for the CPU affinity
reverse-map based on the number of rx rings of the device. However,
mlx4_assign_eq() calls irq_cpu_rmap_add() as many times as IRQ's are
assigned to EQ's, which can be as large as mlx4_dev->caps.comp_pool. If
caps.comp_pool is larger than rx_ring_num we will eventually hit the
BUG_ON() in cpu_rmap_add().
Fix this problem by allocating space for the maximum number of CPU
affinity reverse-map objects we might want to add.
Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Acked-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The memory to hold the network device tx_cq is not being allocated with
the correct size in mlx4_en_init_netdev(). It should use MAX_TX_RINGS
instead of MAX_RX_RINGS. This can cause problems if the number of tx
rings being used is greater than MAX_RX_RINGS.
Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Acked-by: Amir Vadai <amirv@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit 7f7d6c282 (net: fec: Ensure that initialization is done prior to
request_irq()) placed fec_ptp_init() into a point that ptp clock was not
available, which causes a division by zero in fec_ptp_start_cyclecounter():
[ 17.895723] Division by zero in kernel.
[ 17.899571] Backtrace:
[ 17.902094] [<80012564>] (dump_backtrace+0x0/0x10c) from [<8056deec>]
(dump_stack+0x18/0x1c)
[ 17.910539] r6:bfba8500 r5:8075c950 r4:bfba8000 r3:bfbd0000
[ 17.916284] [<8056ded4>] (dump_stack+0x0/0x1c) from [<80012688>]
(__div0+0x18/0x20)
[ 17.923968] [<80012670>] (__div0+0x0/0x20) from [<802829c4>] (Ldiv0+0x8/0x10)
[ 17.931140] [<80398534>] (fec_ptp_start_cyclecounter+0x0/0x110) from
[<80394f64>] (fec_restart+0x6c8/0x754)
[ 17.940898] [<8039489c>] (fec_restart+0x0/0x754) from [<803969a0>]
(fec_enet_adjust_link+0xdc/0x108)
[ 17.950046] [<803968c4>] (fec_enet_adjust_link+0x0/0x108) from [<80390bc4>]
(phy_state_machine+0x178/0x534)
...
Fix this by rearraging the code so that fec_ptp_init() is called only after
the clocks have been properly acquired.
Tested on both mx53 and mx6 platforms.
Reported-by: Jim Baxter <jim_baxter@mentor.com>
Signed-off-by: Fabio Estevam <fabio.estevam@freescale.com>
Signed-off-by: David S. Miller <davem@davemloft.net>