Commit Graph

590222 Commits

Author SHA1 Message Date
Qing Huang
a7c556546f RDS: fix endianness for dp_ack_seq
dp->dp_ack_seq is used in big endian format. We need to do the
big endianness conversion when we assign a value in host format
to it.

Signed-off-by: Qing Huang <qing.huang@oracle.com>
Signed-off-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-16 19:01:05 -04:00
Linus Torvalds
306a63bee1 dmaengine fixes for 4.6-rc4
Okay we some driver fixes piled up, so time to get them up.
 
 This time we have some odd fixes in hsu, edma, omap and xilinx.
 Usual fixes and nothing special
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXEnqKAAoJEHwUBw8lI4NH3ScP/RaxUP64cn4ttUM/G/13jWxs
 xpCtb25fRB7DcpqZpsAV84+NOGSJgJNOGKzWtfzu9aerUGcFiePwsv0V23ocC6oK
 OxLDHQdjR5WXXchTxMqugWzNb0NdaMXDA5hfM1CnStMj4ifZY5r9/l/0ndlrSgOf
 A1rZhg4TlbM1G1xnMYeDPwXRoHDs7k3eHM1XC9y5B50IPqMCsDwX4SODHX2vMu0H
 6MbcturjfNN3hjr/ciTCMRYcab5q3Sv7UP31KSLA8LGzNojntj3shy2jLU6olQdn
 IT0nUA+K7eYk/4Q2HGIo9LhM33XLBuCSLKca2H6/xt8R6J0J6ASwbQqa1k4A1aSv
 n/jeQzlvgkGZA0Spat5Qms24NjtSbEllPcMkkoXXQsZctk4Hm2PRAOxVDFdXPt3C
 0DMXHPnVhkOvamXqZYsvy/bqqUKDrHQvxW5kqtpn6W+B7bx5PPmPFRHtFWcbatsY
 5y+D+3n0RNZqtKQ80M/3pRqpDh6MtrdfO9kuakS0BqpZZEFzZ5F0c2MzO93K2+3e
 lVzj1iwTXcYJTF/DTuPOKmu/g6BC3oqaLNXDlsUEODiczzZwcbEvDxa4Z1otzQn3
 JJ0diVqQRAbIdvVSpSUPMwaqLn+s5r0Uz9tLRNdARNI22a6YgnOK3MEyq+9+OMxl
 rEA4ntRPxD2YzN7KfRY1
 =bSYQ
 -----END PGP SIGNATURE-----

Merge tag 'dmaengine-fix-4.6-rc4' of git://git.infradead.org/users/vkoul/slave-dma

Pull dmaengine fixes from Vinod Koul:
 "This time we have some odd fixes in hsu, edma, omap and xilinx.

  Usual fixes and nothing special"

* tag 'dmaengine-fix-4.6-rc4' of git://git.infradead.org/users/vkoul/slave-dma:
  dmaengine: dw: fix master selection
  dmaengine: edma: special case slot limit workaround
  dmaengine: edma: Remove dynamic TPTC power management feature
  dmaengine: vdma: don't crash when bad channel is requested
  dmaengine: omap-dma: Do not suppress interrupts for memcpy
  dmaengine: omap-dma: Fix polled channel completion detection and handling
  dmaengine: hsu: correct use of channel status register
  dmaengine: hsu: correct residue calculation of active descriptor
  dmaengine: hsu: set HSU_CH_MTSR to memory width
2016-04-16 15:52:38 -07:00
Linus Torvalds
ac82a57aff Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixlet from Ingo Molnar:
 "Fixes a build warning on certain Kconfig combinations"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/lockdep: Fix print_collision() unused warning
2016-04-16 15:43:19 -07:00
Linus Torvalds
e2f50c5c6c Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fix from Ingo Molnar:
 "An arm64 boot crash fix"

* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  efi/arm64: Don't apply MEMBLOCK_NOMAP to UEFI memory map mapping
2016-04-16 15:37:05 -07:00
David S. Miller
ec9dcd3507 Merge branch 'w5100-spi-and-w5200-support'
Akinobu Mita says:

====================
net: w5100: add support W5100/W5200 for SPI interface

This series add support for Wiznet W5100 and W5200 for SPI interface.

We can easily find the ethernet modules and shield for Arduino with
these chips for purchase.  I've tested them with BeagleBone.

Wiznet W5100 for mmio access has already supported by w5100 driver.

In order to share the code between mmio mode and SPI mode, this series
firstly adds ability to support another register access interface to
the existing w5100 driver.  This ground work also requires to introduce
workqueue and threaded irq because SPI transfers are callable only from
contexts that can sleep unlike mmio access.

The latter part of this series adds w5100-spi driver which actually
support W5100 and W5200 for SPI interface.  Supporting W5100 is
straight forward because it only required to add a register access
interface by the SPI transfer.  W5100 and W5200 have similar memory
map which justifies adding W5200 support to w5100 driver.

* Changes from v2 to v3
- Add comment for reg_lock
- Add ability to allocate ops specific data structure
- Allocate w5200 ops specific data structure to put DMA-safe buffer
- Add missing chip_id assignment for w5100_*_ops

* Changes from v1 to v2
- Use a plain single pointer instead of SKB queue, spotted by David S. Miller
- Correct timeout period in w5100_command
- Use spi_write_then_read instead of spi_write which needs DMA-safe buffer
- Support W5200
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-16 18:30:27 -04:00
Akinobu Mita
0c165ff2d8 net: w5100: support W5200
This adds support for W5200 chip.

W5100 and W5200 have similar memory map although some of their offsets
are different.  The register access sequences between them are different
but w5100 driver has abstraction layer for difference bus interface
modes so it is easy to add W5200 support to w5100 and w5100-spi drivers.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Mike Sinkovsky <msink@permonline.ru>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-16 18:30:27 -04:00
Akinobu Mita
630cf09751 net: w5100: support SPI interface mode
This adds new w5100-spi driver which shares the bus interface
independent code with existing w5100 driver.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Mike Sinkovsky <msink@permonline.ru>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-16 18:30:27 -04:00
Akinobu Mita
bf2c6b90b3 net: w5100: enable to support sleepable register access interface
SPI transfer routines are callable only from contexts that can sleep.

This adds ability to tell the core driver that the interface mode
cannot access w5100 register on atomic contexts.  In this case,
workqueue and threaded irq are required.

This also corrects timeout period waiting for command register to be
automatically cleared because the latency of the register access with
SPI transfer can be interfered by other contexts.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Mike Sinkovsky <msink@permonline.ru>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-16 18:30:27 -04:00
Akinobu Mita
850576cfed net: w5100: add ability to support other bus interface
The w5100 driver currently only supports direct and indirect bus
interface mode which use MMIO space for accessing w5100 registers.

In order to support SPI interface mode which is supported by W5100 chip,
this makes the bus interface abstraction layer more generic so that
separated w5100-spi driver can use w5100 driver as core module.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Mike Sinkovsky <msink@permonline.ru>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-16 18:30:27 -04:00
Akinobu Mita
d6586d2ef4 net: w5100: move mmiowb into register access callbacks
Instead of sprinkle mmiowb over the driver code, move it into primary
register write callbacks. (w5100_write, w5100_write16, w5100_writebuf)

This is a preparation for supporting SPI interface which doesn't use
MMIO for accessing w5100 registers.

Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Cc: Mike Sinkovsky <msink@permonline.ru>
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-16 18:30:26 -04:00
Hannes Frederic Sowa
544a773a01 vxlan: reduce usage of synchronize_net in ndo_stop
We only need to do the synchronize_net dance once for both, ipv4 and
ipv6 sockets, thus removing one synchronize_net in case both sockets get
dismantled.

Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-16 18:23:23 -04:00
Hannes Frederic Sowa
0412bd931f vxlan: synchronously and race-free destruction of vxlan sockets
Due to the fact that the udp socket is destructed asynchronously in a
work queue, we have some nondeterministic behavior during shutdown of
vxlan tunnels and creating new ones. Fix this by keeping the destruction
process synchronous in regards to the user space process so IFF_UP can
be reliably set.

udp_tunnel_sock_release destroys vs->sock->sk if reference counter
indicates so. We expect to have the same lifetime of vxlan_sock and
vxlan_sock->sock->sk even in fast paths with only rcu locks held. So
only destruct the whole socket after we can be sure it cannot be found
by searching vxlan_net->sock_list.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jiri Benc <jbenc@redhat.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-16 18:23:01 -04:00
Vinod Koul
956e6c8e18 Merge branch 'fix/edma' into fixes 2016-04-16 22:52:03 +05:30
Vinod Koul
1cc3334e2e Merge branch 'fix/xilinx' into fixes 2016-04-16 22:45:26 +05:30
Vinod Koul
4bd613596b Merge branch 'fix/omap' into fixes 2016-04-16 22:45:17 +05:30
Vinod Koul
09c505ced3 Merge branch 'fix/hsu' into fixes 2016-04-16 22:44:32 +05:30
Daniel Borkmann
9241e2df4f vlan: pull on __vlan_insert_tag error path and fix csum correction
When __vlan_insert_tag() fails from skb_vlan_push() path due to the
skb_cow_head(), we need to undo the __skb_push() in the error path
as well that was done earlier to move skb->data pointer to mac header.

Moreover, I noticed that when in the non-error path the __skb_pull()
is done and the original offset to mac header was non-zero, we fixup
from a wrong skb->data offset in the checksum complete processing.

So the skb_postpush_rcsum() really needs to be done before __skb_pull()
where skb->data still points to the mac header start and thus operates
under the same conditions as in __vlan_insert_tag().

Fixes: 93515d53b1 ("net: move vlan pop/push functions into common code")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 23:20:11 -04:00
wangweidong
f48256efed phy: make some bits preserved while setup forced mode
When tested the PHY SGMII Loopback:
1.set the LOOPBACK bit,
2.set the autoneg to AUTONEG_DISABLE, it calls the
genphy_setup_forced which will clear the bit.

The BMCR_LOOPBACK bit should be preserved.

As Florian pointed out that other bits should be preserved too.
So I make the BMCR_ISOLATE and BMCR_PDOWN as well.

Signed-off-by: Weidong Wang <wangweidong1@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 20:10:00 -04:00
Linus Torvalds
2e57259913 Merge branch 'for-linus' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "A few fixes for the current series. This contains:

   - Two fixes for NVMe:

     One fixes a reset race that can be triggered by repeated
     insert/removal of the module.

     The other fixes an issue on some platforms, where we get probe
     timeouts since legacy interrupts isn't working.  This used not to
     be a problem since we had the worker thread poll for completions,
     but since that was killed off, it means those poor souls can't
     successfully probe their NVMe device.  Use a proper IRQ check and
     probe (msi-x -> msi ->legacy), like most other drivers to work
     around this.  Both from Keith.

   - A loop corruption issue with offset in iters, from Ming Lei.

   - A fix for not having the partition stat per cpu ref count
     initialized before sending out the KOBJ_ADD, which could cause user
     space to access the counter prior to initialization.  Also from
     Ming Lei.

   - A fix for using the wrong congestion state, from Kaixu Xia"

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: loop: fix filesystem corruption in case of aio/dio
  NVMe: Always use MSI/MSI-x interrupts
  NVMe: Fix reset/remove race
  writeback: fix the wrong congested state variable definition
  block: partition: initialize percpuref before sending out KOBJ_ADD
2016-04-15 15:44:10 -07:00
Linus Torvalds
f3c9a1abbe Merge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Ross Zwisler:
 "Two fixes:

   - Fix memcpy_from_pmem() to fallback to memcpy() for architectures
     where CONFIG_ARCH_HAS_PMEM_API=n.

   - Add a comment explaining why we write data twice when clearing
     poison in pmem_do_bvec().

  This has passed a boot test on an X86_32 config, which was the
  architecture where issue #1 above was first noticed"

Dan Williams adds:
 "We're giving this multi-maintainer setup a shot, so expect libnvdimm
  pull requests from either Ross or I going forward"

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  libnvdimm, pmem: clarify the write+clear_poison+write flow
  pmem: fix BUG() error in pmem.h:48 on X86_32
2016-04-15 15:34:27 -07:00
Linus Torvalds
29dde7c25a One MTD fix for v4.6-rc4:
In the v4.4 cycle, we relaxed the requirement for assigning mtd->owner, but we
 didn't remove this error case. It's hit only by drivers that are both:
 
 (a) using nand_scan() directly and
 (b) built as modules
 
 We haven't seen explicit complaints about this (most use cases don't fit one or
 both of the above), but we should definitely not be BUG()'ing here.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXEWZUAAoJEFySrpd9RFgtuaYQAIjfmoefoDJuIw3QQE5tLNMc
 xTj07f+V/Nzhats69zvjx4LNFF6R339IBFNq7WK+q7l7VbW6/XSa8XaQkgpO+wMI
 Sqzkty4YIxQGucITVvauv82JW0K421XVUBAzeNt+4oT72RvMnXhOPka1LQXf3Ezh
 vGb7ZGea7ZYuu060OQAGxJlrlDA5x7yhsvs3hpOT9hjx0jXr0JZXmohvFRqSiXqz
 Rx60rmhpl37WJg46OX/LYRAF34StF8btDZsQVdpY75T9QtxD3vWvvlMHfJ4cQXRI
 L3qUIGGj03tD0/4nf6/p+WMjxgctDqROVxWzLHOvTa+xIFR7iU8yra09fTbylOGL
 C3I0B6kW05YJrY3TegL8EhNbotciC9SeCtbeRwbndyO1Jox88xGHovwWFN5++3Q+
 4ahiP57cV/py1L2BgeYdPmHES6T4GLH/4y2MaMZr4hfGVxHbKN1/JkLp9cNqCSzZ
 MtnEevbHYq4eN1rYULQCzLRxawhYjeY77cMYip6Ynu24UWUJmfdaroJncfmmBOhP
 7NvWjtxpsjonrCVOdb4aq9UPTc3PxlyR4kZyzg7+Vx4eOByK8hFkfjckehJ64DCs
 ac+r5d2LmaLqAMIvQt4ApwwzRzj4ST6CEWBEyKrvwwF4B4nHhkbf9Pf4IrzGmaao
 W4dws70o9WUwGIv0Rw3D
 =wxaC
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20160415' of git://git.infradead.org/linux-mtd

Pull MTD fix from Brian Norris:
 "One MTD fix for v4.6-rc4:

  In the v4.4 cycle, we relaxed the requirement for assigning
  mtd->owner, but we didn't remove this error case.  It's hit only
  by drivers that are both:

   (a) using nand_scan() directly
  and
   (b) built as modules

  We haven't seen explicit complaints about this (most use cases don't
  fit one or both of the above), but we should definitely not be
  BUG()'ing here"

* tag 'for-linus-20160415' of git://git.infradead.org/linux-mtd:
  mtd: nand: Drop mtd.owner requirement in nand_scan
2016-04-15 15:25:09 -07:00
Linus Torvalds
2fffad1287 MMC core:
- Restore similar old behaviour when assigning mmcblk device indexes
 
 MMC host:
  - tegra: Disable UHS-I modes for Tegra124 to fix regression
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXELVaAAoJEP4mhCVzWIwpvCQP/i1P0tPTLKvHmuKMnCbrIVN/
 GWbUNjcui4YnOBufJDFZ3QnHVWpnSDpa+dGmzsO0Vpi7OSRWmvgWwwqwspSZUcp/
 2c//US3R7iiVQ2ww2Kf7IErmi2CqHxL+ItWtZO3MaabiC0fCPPtjBRnfbftTFBJ5
 5zcHeXJTkBF8rXtaz6URwACvlGsPluvVutxLRsEoHDQU0rsTgysbzTr2fu+M7IDI
 jE3Q0U5xdIlT8YqmsQTUwdWF12STlBJfHwsGxOSxZTOnLOiNwoSoDftXvA+qH27V
 C/gJUP8Di6sgYKPY7zqdN9eF0DAnK6n4QSpyzFEG9U2zNvW6enXiv8RBWRhfGjVp
 VW/1XE8DE//l+WCxEQhjiFcBDAJ8ahfcdbsp7Ngd5t8WJVydL0dGpHLolCWwABMl
 YMuRaH2eJznxazn0CRrIxRO9OIISandj3G5COkOOaar+2YzIWUotrk+q+swyq2x/
 5hgSyaxYm27J7RUTx31YtQxCZownrNvyQAwj3omoe2ffWTENJeAobkBiLrzbSuaL
 2TxhRJig75vOcFN6cZI0dhHR47j6uYVYJ62aWi3OzwpGR5Wz7h4zkVN5GRmNYBdE
 JB5neGiEeCDnvbZ5tGHR3+ZNnDjFthAK88NagqknZFlnm6VOIsj4dCa2gBvwc1xu
 wOobQCSH7JuQqS7JHjuy
 =nnxN
 -----END PGP SIGNATURE-----

Merge tag 'mmc-v4.6-rc3' of git://git.linaro.org/people/ulf.hansson/mmc

Pull MMC fixes from Ulf Hansson:
 "Here are a couple of mmc fixes intended for v4.6 rc4.

  Regarding the fix for the regression about mmcblk device indexes.  The
  approach taken to solve the problem seems to be good enough.  There
  were some discussions around the solution, but it seems like people
  were happy about it in the end.

  MMC core:
   - Restore similar old behaviour when assigning mmcblk device indexes

  MMC host:
   - tegra: Disable UHS-I modes for Tegra124 to fix regression"

* tag 'mmc-v4.6-rc3' of git://git.linaro.org/people/ulf.hansson/mmc:
  mmc: tegra: Disable UHS-I modes for Tegra124
  mmc: block: Use the mmc host device index as the mmcblk device index
2016-04-15 15:10:32 -07:00
Linus Torvalds
ab5f9ebac1 Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
 "This contains fixes for exynos, amdgpu, radeon, i915 and qxl.

  It also contains some fixes to the core drm edid parser.

  qxl:
   - fix for a cursor hotspot issue

  radeon:
   - some MST fixes that I've been running locally and make my monitor a
     bit happier

  exynos:
   - fix some regressions and build fixes

  amdgpu:
   - a couple of small fixes

  i915:
   - two DP MST fixes and a couple of other regression fixes

  Nothing too out of the ordinary or surprising at this point"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/exynos: Use VIDEO_SAMSUNG_S5P_G2D=n as G2D Kconfig dependency
  drm/exynos: fix a warning message
  drm/exynos: mic: fix an error code
  drm/exynos: fimd: fix broken dp_clock control
  drm/exynos: build fbdev code conditionally
  drm/exynos: fix adjusted_mode pointer in exynos_plane_mode_set
  drm/exynos: fix error handling in exynos_drm_subdrv_open
  drm/amd/amdgpu: fix irq domain remove for tonga ih
  drm/i915: fix deadlock on lid open
  drm/radeon: use helper for mst connector dpms.
  drm/radeon/mst: port some MST setup code from DAL.
  drm/amdgpu: add invisible pin size statistic
  drm/edid: Fix DMT 1024x768@43Hz (interlaced) timings
  drm/i915: Exit cherryview_irq_handler() after one pass
  drm/i915: Call intel_dp_mst_resume() before resuming displays
  drm/i915: Fix race condition in intel_dp_destroy_mst_connector()
  drm/edid: Fix parsing of EDID 1.4 Established Timings III descriptor
  drm/edid: Fix EDID Established Timings I and II
  drm/qxl: fix cursor position with non-zero hotspot
2016-04-15 14:59:28 -07:00
Linus Torvalds
60ea7bb007 Merge branch 'parisc-4.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc ftrace fixes from Helge Deller:
 "This is (most likely) the last pull request for v4.6 for the parisc
  architecture.

  It fixes the FTRACE feature for parisc, which is horribly broken since
   quite some time and doesn't even compile.  This patch just fixes the
  bare minimum (it actually removes more lines than it adds), so that
  the function tracer works again on 32- and 64bit kernels.

  I've queued up additional patches on top of this patch which e.g. add
  the syscall tracer, but those have to wait for the merge window for
  v4.7."

* 'parisc-4.6-4' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Fix ftrace function tracer
2016-04-15 14:51:45 -07:00
David S. Miller
4e811b1e11 Merge branch 'sctp-diag'
Xin Long says:

====================
sctp: support sctp_diag in kernel

This patchset will add sctp_diag module to implement diag interface on
sctp in kernel.

For a listening sctp endpoint, we will just dump it's ep info.
For a sctp connection, we will the assoc info and it's ep info.

The ss dump will looks like:

[iproute2]# ./misc/ss --sctp  -n -l
State      Recv-Q Send-Q   Local Address:Port       Peer Address:Port
LISTEN     0      128      172.16.254.254:8888      *:*
LISTEN     0      5        127.0.0.1:1234           *:*
LISTEN     0      5        127.0.0.1:1234           *:*
  - ESTAB  0      0        127.0.0.1%lo:1234        127.0.0.1:4321
LISTEN     0      128      172.16.254.254:8888      *:*
  - ESTAB  0      0        172.16.254.254%eth1:8888 172.16.253.253:8888
  - ESTAB  0      0        172.16.254.254%eth1:8888 172.16.1.1:8888
  - ESTAB  0      0        172.16.254.254%eth1:8888 172.16.1.2:8888
  - ESTAB  0      0        172.16.254.254%eth1:8888 172.16.2.1:8888
  - ESTAB  0      0        172.16.254.254%eth1:8888 172.16.2.2:8888
  - ESTAB  0      0        172.16.254.254%eth1:8888 172.16.3.1:8888
  - ESTAB  0      0        172.16.254.254%eth1:8888 172.16.3.2:8888
LISTEN     0      0        127.0.0.1:4321           *:*
  - ESTAB  0      0        127.0.0.1%lo:4321        127.0.0.1:1234

The entries with '- ESTAB' are the assocs, some of them may belong to
the same endpoint. So we will dump the parent endpoint first, like the
entry with 'LISTEN'. then dump the assocs. ep and assocs entries will
be dumped in right order so that ss can show them in tree format easily.

Besides, this patchset also simplifies sctp proc codes, cause it has
some similar codes with sctp diag in sctp transport traversal.

v1->v2:
  1. inet_diag_get_handler needs to return it as const.
  2. merge 5/7 into 2/7 of v1.

v2->v3:
  do some improvements and fixes in patch 1-4, see the details in
  each patch's comment.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 17:29:37 -04:00
Xin Long
53fa10369c sctp: fix some rhashtable functions using in sctp proc/diag
When rhashtable_walk_init return err, no release function should be
called, and when rhashtable_walk_start return err, we should only invoke
rhashtable_walk_exit to release the source.

But now when sctp_transport_walk_start return err, we just call
rhashtable_walk_stop/exit, and never care about if rhashtable_walk_init
or start return err, which is so bad.

We will fix it by calling rhashtable_walk_exit if rhashtable_walk_start
return err in sctp_transport_walk_start, and if sctp_transport_walk_start
return err, we do not need to call sctp_transport_walk_stop any more.

For sctp proc, we will use 'iter->start_fail' to decide if we will call
rhashtable_walk_stop/exit.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 17:29:37 -04:00
Xin Long
b5e2f4e699 sctp: merge the seq_start/next/exits in remaddrs and assocs
In sctp proc, these three functions in remaddrs and assocs are the
same. we should merge them into one.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 17:29:36 -04:00
Xin Long
8f840e47f1 sctp: add the sctp_diag.c file
This one will implement all the interface of inet_diag, inet_diag_handler.
which includes sctp_diag_dump, sctp_diag_dump_one and sctp_diag_get_info.

It will work as a module, and register inet_diag_handler when loading.

v2->v3:
- fix the mistake in inet_assoc_attr_size().

- change inet_diag_msg_laddrs_fill() name to inet_diag_msg_sctpladdrs_fill.

- change inet_diag_msg_paddrs_fill() name to inet_diag_msg_sctpaddrs_fill.

- add inet_diag_msg_sctpinfo_fill() to make asoc/ep fill code clearer.

- add inet_diag_msg_sctpasoc_fill() to make asoc fill code clearer.

- merge inet_asoc_diag_fill() and inet_ep_diag_fill() to
  inet_sctp_diag_fill().

- call sctp_diag_get_info() directly, instead by handler, cause the caller
  is in the same file with it.

- call lock_sock in sctp_tsp_dump_one() to make sure we call get sctp info
  safely.

- after lock_sock(sk), we should check sk != assoc->base.sk.

- change mem[SK_MEMINFO_WMEM_ALLOC] to asoc->sndbuf_used for asoc dump when
  asoc->ep->sndbuf_policy is set. don't use INET_DIAG_MEMINFO attr any more.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 17:29:36 -04:00
Xin Long
cb2050a7b8 sctp: export some functions for sctp_diag in inet_diag
inet_diag_msg_common_fill is used to fill the diag msg common info,
we need to use it in sctp_diag as well, so export it.

inet_diag_msg_attrs_fill is used to fill some common attrs info between
sctp diag and tcp diag.

v2->v3:
- do not need to define and export inet_diag_get_handler any more.
  cause all the functions in it are in sctp_diag.ko, we just call
  them in sctp_diag.ko.

- add inet_diag_msg_attrs_fill to make codes clear.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 17:29:36 -04:00
Xin Long
626d16f50f sctp: export some apis or variables for sctp_diag and reuse some for proc
For some main variables in sctp.ko, we couldn't export it to other modules,
so we have to define some api to access them.

It will include sctp transport and endpoint's traversal.

There are some transport traversal functions for sctp_diag, we can also
use it for sctp_proc. cause they have the similar situation to traversal
transport.

v2->v3:
- rhashtable_walk_init need the parameter gfp, because of recent upstrem
  update

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 17:29:36 -04:00
Xin Long
52c52a61a3 sctp: add sctp_info dump api for sctp_diag
sctp_diag will dump some important details of sctp's assoc or ep, we use
sctp_info to describe them,  sctp_get_sctp_info to get them, and export
it to sctp_diag.ko.

v2->v3:
- we will not use list_for_each_safe in sctp_get_sctp_info, cause
  all the callers of it will use lock_sock.

- fix the holes in struct sctp_info with __reserved* field.
  because sctp_diag is a new feature, and sctp_info is just for now,
  it may be changed in the future.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 17:29:35 -04:00
Andrew Goodbody
cfe2556001 cpsw: Prevent NUll pointer dereference with two PHYs
Adding a 2nd PHY to cpsw results in a NULL pointer dereference
as below. Fix by maintaining a reference to each PHY node in slave
struct instead of a single reference in the priv struct which was
overwritten by the 2nd PHY.

[   17.870933] Unable to handle kernel NULL pointer dereference at virtual address 00000180
[   17.879557] pgd = dc8bc000
[   17.882514] [00000180] *pgd=9c882831, *pte=00000000, *ppte=00000000
[   17.889213] Internal error: Oops: 17 [#1] ARM
[   17.893838] Modules linked in:
[   17.897102] CPU: 0 PID: 1657 Comm: connmand Not tainted 4.5.0-ge463dfb-dirty #11
[   17.904947] Hardware name: Cambrionix whippet
[   17.909576] task: dc859240 ti: dc968000 task.ti: dc968000
[   17.915339] PC is at phy_attached_print+0x18/0x8c
[   17.920339] LR is at phy_attached_info+0x14/0x18
[   17.925247] pc : [<c042baec>]    lr : [<c042bb74>]    psr: 600f0113
[   17.925247] sp : dc969cf8  ip : dc969d28  fp : dc969d18
[   17.937425] r10: dda7a400  r9 : 00000000  r8 : 00000000
[   17.942971] r7 : 00000001  r6 : ddb00480  r5 : ddb8cb34  r4 : 00000000
[   17.949898] r3 : c0954cc0  r2 : c09562b0  r1 : 00000000  r0 : 00000000
[   17.956829] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
[   17.964401] Control: 10c5387d  Table: 9c8bc019  DAC: 00000051
[   17.970500] Process connmand (pid: 1657, stack limit = 0xdc968210)
[   17.977059] Stack: (0xdc969cf8 to 0xdc96a000)
[   17.981692] 9ce0:                                                       dc969d28 dc969d08
[   17.990386] 9d00: c038f9bc c038f6b4 ddb00480 dc969d34 dc969d28 c042bb74 c042bae4 00000000
[   17.999080] 9d20: c09562b0 c0954cc0 dc969d5c dc969d38 c043ebfc c042bb6c 00000007 00000003
[   18.007773] 9d40: ddb00000 ddb8cb58 ddb00480 00000001 dc969dec dc969d60 c0441614 c043ea68
[   18.016465] 9d60: 00000000 00000003 00000000 fffffff4 dc969df4 0000000d 00000000 00000000
[   18.025159] 9d80: dc969db4 dc969d90 c005dc08 c05839e0 dc969df4 0000000d ddb00000 00001002
[   18.033851] 9da0: 00000000 00000000 dc969dcc dc969db8 c005ddf4 c005dbc8 00000000 00000118
[   18.042544] 9dc0: dc969dec dc969dd0 ddb00000 c06db27c ffff9003 00001002 00000000 00000000
[   18.051237] 9de0: dc969e0c dc969df0 c057c88c c04410dc dc969e0c ddb00000 ddb00000 00000001
[   18.059930] 9e00: dc969e34 dc969e10 c057cb44 c057c7d8 ddb00000 ddb00138 00001002 beaeda20
[   18.068622] 9e20: 00000000 00000000 dc969e5c dc969e38 c057cc28 c057cac0 00000000 dc969e80
[   18.077315] 9e40: dda7a40c beaeda20 00000000 00000000 dc969ecc dc969e60 c05e36d0 c057cc14
[   18.086007] 9e60: dc969e84 00000051 beaeda20 00000000 dda7a40c 00000014 ddb00000 00008914
[   18.094699] 9e80: 30687465 00000000 00000000 00000000 00009003 00000000 00000000 00000000
[   18.103391] 9ea0: 00001002 00008914 dd257ae0 beaeda20 c098a428 beaeda20 00000011 00000000
[   18.112084] 9ec0: dc969edc dc969ed0 c05e4e54 c05e3030 dc969efc dc969ee0 c055f5ac c05e4cc4
[   18.120777] 9ee0: beaeda20 dd257ae0 dc8ab4c0 00008914 dc969f7c dc969f00 c010b388 c055f45c
[   18.129471] 9f00: c071ca40 dd257ac0 c00165e8 dc968000 dc969f3c dc969f20 dc969f64 dc969f28
[   18.138164] 9f20: c0115708 c0683ec8 dd257ac0 dd257ac0 dc969f74 dc969f40 c055f350 c00fc66c
[   18.146857] 9f40: dd82e4d0 00000011 00000000 00080000 dd257ac0 00000000 dc8ab4c0 dc8ab4c0
[   18.155550] 9f60: 00008914 beaeda20 00000011 00000000 dc969fa4 dc969f80 c010bc34 c010b2fc
[   18.164242] 9f80: 00000000 00000011 00000002 00000036 c00165e8 dc968000 00000000 dc969fa8
[   18.172935] 9fa0: c00163e0 c010bbcc 00000000 00000011 00000011 00008914 beaeda20 00009003
[   18.181628] 9fc0: 00000000 00000011 00000002 00000036 00081018 00000001 00000000 beaedc10
[   18.190320] 9fe0: 00083188 beaeda1c 00043a5d b6d29c0c 600b0010 00000011 00000000 00000000
[   18.198989] Backtrace:
[   18.201621] [<c042bad8>] (phy_attached_print) from [<c042bb74>] (phy_attached_info+0x14/0x18)
[   18.210664]  r3:c0954cc0 r2:c09562b0 r1:00000000
[   18.215588]  r4:ddb00480
[   18.218322] [<c042bb60>] (phy_attached_info) from [<c043ebfc>] (cpsw_slave_open+0x1a0/0x280)
[   18.227293] [<c043ea5c>] (cpsw_slave_open) from [<c0441614>] (cpsw_ndo_open+0x544/0x674)
[   18.235874]  r7:00000001 r6:ddb00480 r5:ddb8cb58 r4:ddb00000
[   18.241944] [<c04410d0>] (cpsw_ndo_open) from [<c057c88c>] (__dev_open+0xc0/0x128)
[   18.249972]  r9:00000000 r8:00000000 r7:00001002 r6:ffff9003 r5:c06db27c r4:ddb00000
[   18.258255] [<c057c7cc>] (__dev_open) from [<c057cb44>] (__dev_change_flags+0x90/0x154)
[   18.266745]  r5:00000001 r4:ddb00000
[   18.270575] [<c057cab4>] (__dev_change_flags) from [<c057cc28>] (dev_change_flags+0x20/0x50)
[   18.279523]  r9:00000000 r8:00000000 r7:beaeda20 r6:00001002 r5:ddb00138 r4:ddb00000
[   18.287811] [<c057cc08>] (dev_change_flags) from [<c05e36d0>] (devinet_ioctl+0x6ac/0x76c)
[   18.296483]  r9:00000000 r8:00000000 r7:beaeda20 r6:dda7a40c r5:dc969e80 r4:00000000
[   18.304762] [<c05e3024>] (devinet_ioctl) from [<c05e4e54>] (inet_ioctl+0x19c/0x1c8)
[   18.312882]  r10:00000000 r9:00000011 r8:beaeda20 r7:c098a428 r6:beaeda20 r5:dd257ae0
[   18.321235]  r4:00008914
[   18.323956] [<c05e4cb8>] (inet_ioctl) from [<c055f5ac>] (sock_ioctl+0x15c/0x2d8)
[   18.331829] [<c055f450>] (sock_ioctl) from [<c010b388>] (do_vfs_ioctl+0x98/0x8d0)
[   18.339765]  r7:00008914 r6:dc8ab4c0 r5:dd257ae0 r4:beaeda20
[   18.345822] [<c010b2f0>] (do_vfs_ioctl) from [<c010bc34>] (SyS_ioctl+0x74/0x84)
[   18.353573]  r10:00000000 r9:00000011 r8:beaeda20 r7:00008914 r6:dc8ab4c0 r5:dc8ab4c0
[   18.361924]  r4:00000000
[   18.364653] [<c010bbc0>] (SyS_ioctl) from [<c00163e0>] (ret_fast_syscall+0x0/0x3c)
[   18.372682]  r9:dc968000 r8:c00165e8 r7:00000036 r6:00000002 r5:00000011 r4:00000000
[   18.380960] Code: e92dd810 e24cb010 e24dd010 e59b4004 (e5902180)
[   18.387580] ---[ end trace c80529466223f3f3 ]---

Signed-off-by: Andrew Goodbody <andrew.goodbody@cambrionix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 17:24:37 -04:00
Marcelo Ricardo Leitner
311b21774f sctp: simplify sk_receive_queue locking
SCTP already serializes access to rcvbuf through its sock lock:
sctp_recvmsg takes it right in the start and release at the end, while
rx path will also take the lock before doing any socket processing. On
sctp_rcv() it will check if there is an user using the socket and, if
there is, it will queue incoming packets to the backlog. The backlog
processing will do the same. Even timers will do such check and
re-schedule if an user is using the socket.

Simplifying this will allow us to remove sctp_skb_list_tail and get ride
of some expensive lockings.  The lists that it is used on are also
mangled with functions like __skb_queue_tail and __skb_unlink in the
same context, like on sctp_ulpq_tail_event() and sctp_clear_pd().
sctp_close() will also purge those while using only the sock lock.

Therefore the lockings performed by sctp_skb_list_tail() are not
necessary. This patch removes this function and replaces its calls with
just skb_queue_splice_tail_init() instead.

The biggest gain is at sctp_ulpq_tail_event(), because the events always
contain a list, even if it's queueing a single skb and this was
triggering expensive calls to spin_lock_irqsave/_irqrestore for every
data chunk received.

As SCTP will deliver each data chunk on a corresponding recvmsg, the
more effective the change will be.
Before this patch, with chunks with 30 bytes:
netperf -t SCTP_STREAM -H 192.168.1.2 -cC -l 60 -- -m 30 -S 400000
400000 -s 400000 400000
on a 10Gbit link with 1500 MTU:

SCTP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.1 () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

425984 425984     30    60.00       137.45   7.34     7.36     52.504  52.608

With it:

SCTP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.1 () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % S      us/KB   us/KB

425984 425984     30    60.00       179.10   7.97     6.70     43.740  36.788

Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 17:22:20 -04:00
David S. Miller
936d4b41b0 Merge branch 'mlx5_ifc-updates'
Saeed Mahameed says:

====================
mlx5_core: mlx5_ifc updates

This series include mlx5_core updates for both net-next and rdma
trees for 4.7 kernel cycle. This is the only shared code planned
for 4.7 between rdma and net trees. Hopefully, this will prevent
future conflicts when merging between ib-next and net-next once
4.7 cycle is over and merge window is opened.

Both Mellanox rdma and net submissions will proceed once this series
is applied into both trees.

Future shared code will be sent to both maintainers as pull requests
from Mellanox's kernel.org tree.

We have included all the maintainers of respective drivers.
Kindly review the change and let us know in case of any review comments.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 17:21:10 -04:00
Saeed Mahameed
7d5e14237a net/mlx5: Update mlx5_ifc hardware features
Adding the needed mlx5_ifc hardware bits and structs
for the following feature:

* Add vport to steering commands for SRIOV ACL support
* Add mlcr, pcmr and mcia registers for dump module EEPROM
* Add support for FCS, baeacon led and disable_link bits to
  hca caps
* Add CQE period mode bit in  CQ context for CQE based CQ
  moderation support
* Add umr SQ bit for fragmented memory registration
* Add needed bits and caps for Striding RQ support

Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 17:21:10 -04:00
Tariq Toukan
e1c9c62b9a net/mlx5: Fix mlx5 ifc cmd_hca_cap bad offsets
All reserved fields after early_vf_enable are off by 1, since
early_vf_enable was not explicitly declared as array of size 1.

Reserved field before cqe_zip had a wrong size, it should
be 0x80 + 0x3f.

Fixes: b084444459 ("net/mlx5_core: Introduce access function to read internal timer ")
Fixes: b4ff3a36d3 ("net/mlx5: Use offset based reserved field names in the IFC header file")
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Matan Barak <matanb@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 17:21:09 -04:00
David S. Miller
993feee979 Merge branch 'qed-tunneling-offload'
Manish Chopra says:

====================
qed/qede: Add tunneling support

This patch series adds support for VXLAN, GRE and GENEVE tunnels
to be used over this driver. With this support, adapter can perform
TSO offload, inner/outer checksums offloads on TX and RX for
encapsulated packets.

V1->V2 [ Comments from Jesse Gross incorporated ]
* Drop general infrastructure change patch.
  "net: Make vxlan/geneve default udp ports public"
* Remove by default Linux default UDP ports configurations in driver.
  Instead, use general registration APIs for UDP port configurations
* Removing .ndo_features_check - we will add it later with proper change.

Please consider applying this series to net-next.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 17:08:09 -04:00
Manish Chopra
14db81defa qede: Add fastpath support for tunneling
This patch enables netdev tunneling features and adds
TX/RX fastpath support for tunneling in driver.

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 17:08:09 -04:00
Manish Chopra
f798586920 qed: Enable GRE tunnel slowpath configuration
Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 17:08:09 -04:00
Manish Chopra
9a109dd073 qed/qede: Add GENEVE tunnel slowpath configuration support
This patch enables GENEVE tunnel on the adapter and
add support for driver hooks to configure UDP ports
for GENEVE tunnel offload to be performed by the adapter.

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 17:08:08 -04:00
Manish Chopra
b18e170cac qed/qede: Add VXLAN tunnel slowpath configuration support
This patch enables VXLAN tunnel on the adapter and
add support for driver hooks to configure UDP ports
for VXLAN tunnel offload to be performed by the adapter.

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 17:08:08 -04:00
Manish Chopra
464f664501 qed: Add infrastructure support for tunneling
This patch adds various structure/APIs needed to configure/enable different
tunnel [VXLAN/GRE/GENEVE] parameters on the adapter.

Signed-off-by: Manish Chopra <manish.chopra@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: Ariel Elior <Ariel.Elior@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 17:08:08 -04:00
Peter Heise
ee1c279772 net/hsr: Added support for HSR v1
This patch adds support for the newer version 1 of the HSR
networking standard. Version 0 is still default and the new
version has to be selected via iproute2.

Main changes are in the supervision frame handling and its
ethertype field.

Signed-off-by: Peter Heise <peter.heise@airbus.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 17:06:48 -04:00
Dan Williams
0a370d261c libnvdimm, pmem: clarify the write+clear_poison+write flow
The ACPI specification does not specify the state of data after a clear
poison operation.  Potential future libnvdimm bus implementations for
other architectures also might not specify or disagree on the state of
data after clear poison.  Clarify why we write twice.

Reported-by: Jeff Moyer <jmoyer@redhat.com>
Reported-by: Vishal Verma <vishal.l.verma@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
2016-04-15 14:59:41 -06:00
David S. Miller
125c8d1233 Merge branch 'tcp-synflood-perf'
Eric Dumazet says:

====================
tcp: final work on SYNFLOOD behavior

In the first patch, I remove the costly association of SYNACK+COOKIES
to a listener. I believe other parts of the stack should be ready.

The second patch removes a useless write into listener socket
in tcp_rcv_state_process(), incurring false sharing in
tcp_conn_request()

Performance under SYNFLOOD goes from 3.2 Mpps to 6 Mpps.

Test was using a single TCP listener, on a host with 8 RX queues
on the NIC, and 24 cores (48 ht)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 16:45:45 -04:00
Eric Dumazet
8804b2722d tcp: remove false sharing in tcp_rcv_state_process()
Last known hot point during SYNFLOOD attack is the clearing
of rx_opt.saw_tstamp in tcp_rcv_state_process()

It is not needed for a listener, so we move it where it matters.

Performance while a SYNFLOOD hits a single listener socket
went from 5 Mpps to 6 Mpps on my test server (24 cores, 8 NIC RX queues)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 16:45:44 -04:00
Eric Dumazet
b3d051477c tcp: do not mess with listener sk_wmem_alloc
When removing sk_refcnt manipulation on synflood, I missed that
using skb_set_owner_w() was racy, if sk->sk_wmem_alloc had already
transitioned to 0.

We should hold sk_refcnt instead, but this is a big deal under attack.
(Doing so increase performance from 3.2 Mpps to 3.8 Mpps only)

In this patch, I chose to not attach a socket to syncookies skb.

Performance is now 5 Mpps instead of 3.2 Mpps.

Following patch will remove last known false sharing in
tcp_rcv_state_process()

Fixes: 3b24d854cb ("tcp/dccp: do not touch listener sk_refcnt under synflood")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 16:45:44 -04:00
Amitoj Kaur Chawla
ac18dd9e84 qlge: Replace create_singlethread_workqueue with alloc_ordered_workqueue
Replace deprecated create_singlethread_workqueue with
alloc_ordered_workqueue.

Work items include getting tx/rx frame sizes, resetting MPI processor,
setting asic recovery bit so ordering seems necessary as only one work
item should be in queue/executing at any given time, hence the use of
alloc_ordered_workqueue.

WQ_MEM_RECLAIM flag has been set since ethernet devices seem to sit in
memory reclaim path, so to guarantee forward progress regardless of
memory pressure.

Signed-off-by: Amitoj Kaur Chawla <amitoj1606@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 16:42:10 -04:00
David S. Miller
0818556cd0 Merge branch 'tipc-link-setup-improvements'
Jon Maloy says:

====================
tipc: improvements to the link setup algorithm

This series addresses some smaller issues regarding the link setup
algorithm. The first commit fixes a rare bug we have discovered during
testing; the second one may have some future impact on cluster
scalabilty, while remaining ones can be regarded as cosmetic in
a wider sense of the word.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 16:09:07 -04:00
Jon Paul Maloy
34b9cd64c8 tipc: let first message on link be a state message
According to the link FSM, a received traffic packet can take a link
from state ESTABLISHING to ESTABLISHED, but the link can still not be
fully set up in one atomic operation. This means that even if the the
very first packet on the link is a traffic packet with sequence number
1 (one), it has to be dropped and retransmitted.

This can be avoided if we let the mentioned packet be preceded by a
LINK_PROTOCOL/STATE message, which takes up the endpoint before the
arrival of the traffic.

We add this small feature in this commit.

This is a fully compatible change.

Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-15 16:09:06 -04:00