Commit Graph

769982 Commits

Author SHA1 Message Date
Jakub Kicinski
f734607e81 xsk: refactor xdp_umem_assign_dev()
Return early and only take the ref on dev once there is no possibility
of failing.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Björn Töpel <bjorn.topel@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-31 09:48:21 -07:00
Jakub Kicinski
c29c2ebd2a net: update real_num_rx_queues even when !CONFIG_SYSFS
We used to depend on real_num_rx_queues as a upper bound for sanity
checks.  For AF_XDP socket validation it's useful if the check behaves
the same regardless of CONFIG_SYSFS setting.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-31 09:48:21 -07:00
Linus Torvalds
c1d61e7fe3 SCSI fixes on 20180731
Nine fixes, five in the qla2xxx driver, the most serious of which is
 the uninitialized list head crash which can be observed in most
 systems under a sufficiently loaded low memory environment.  The two
 sg fixes are minor but obvious and two target ones which seem
 reasonable but not high impact.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.vnet.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCW2B83yYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishfjaAQDN5p+O
 kB54JeM3Ae0IT7StDsK+LECoeGj2fYACBH+wUQD/eOZEXi6pDJ796VJWwjfqnxWb
 6Eonm3Qtikxj/Q4Z78w=
 =k2Af
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Nine fixes, five in the qla2xxx driver, the most serious of which is
  the uninitialized list head crash which can be observed in most
  systems under a sufficiently loaded low memory environment.

  The two sg fixes are minor but obvious and two target ones which seem
  reasonable but not high impact"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: qla2xxx: Return error when TMF returns
  scsi: qla2xxx: Fix ISP recovery on unload
  scsi: qla2xxx: Fix driver unload by shutting down chip
  scsi: qla2xxx: Fix NPIV deletion by calling wait_for_sess_deletion
  scsi: qla2xxx: Fix unintialized List head crash
  scsi: sg: update comment for blk_get_request()
  scsi: sg: fix minor memory leak in error path
  scsi: libiscsi: fix possible NULL pointer dereference in case of TMF
  scsi: target: iscsi: cxgbit: fix max iso npdu calculation
2018-07-31 09:46:36 -07:00
Jesper Dangaard Brouer
39c64d8c87 mlx5: handle DMA mapping error case for XDP redirect
Commit 58b99ee3e3 ("net/mlx5e: Add support for XDP_REDIRECT in device-out side")
forgot to return/free the xdp_frame in case the DMA mapping failed, correct this.

Also DMA unmap the frame in case mlx5e_xmit_xdp_frame() fails.

Fixes: 58b99ee3e3 ("net/mlx5e: Add support for XDP_REDIRECT in device-out side")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-31 09:44:41 -07:00
Linus Torvalds
095c3633f1 virtio: last-minute fixes
Some bugfixes that seem important and safe enough to merge at the last
 minute.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJbXxanAAoJECgfDbjSjVRpDqoH/iqUcK2KkLQTUut4KWF0xjfs
 0AZxB9sT2fXNAwF4dDS0SywN8oXKRA0c83AY5WgtcJfndPYEhg4mZsiQThLN6GVw
 /CRVizpN1mSbn7ds4Xl5htD2Ml8OxRkAdulOXfG/DZ2eIiEgoQ6vPzDF2jqy1dIj
 yWjBNWoSKIqElkN310BlkDX0hjqSP9zr4kEDFSB7AHcsNRhUDcgRqFzA83bRZU0b
 qKNzeLwU28jYnjzBPjQ449lkHGXtSjSkBnxUOXHU2CVCiQ5I0rBPi/Xdhtd94gDM
 Bbl4Spf74IiwPdfMwk4pa6rO8JPLpYhrFHIG0Gkk8AFV4Gwh1IYQ9s3On3ADWSQ=
 =p1AG
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio fixes from Michael Tsirkin:
 "Some bugfixes that seem important and safe enough to merge at the last
  minute"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio_balloon: fix another race between migration and ballooning
  tools/virtio: add kmalloc_array stub
  tools/virtio: add dma barrier stubs
2018-07-31 09:35:32 -07:00
Linus Torvalds
c786e4052c Urgent ACPI fixes for 4.18
- Fix a recent ACPICA regression introduced by a previous fix
    that caused control method execution at the table level to be
    mishandled by mistake (Erik Schmauss).
 
  - Fix a hibernation regression from the 4.15 cycle in the ACPI
    driver for Intel SoCs (LPSS) that caused the platform firmware
    to be confused during resume from hibernation by the driver's
    PM quirks which was fixed for system-wide suspend/resume (ACPI
    S3) earlier in this cycle, but that previous fix missed the
    hibernation (ACPI S4) case (Rafael Wysocki).
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJbYCTGAAoJEILEb/54YlRxWUkP/js8b6/wz5WEawmyPk6TTs7C
 e+cZyY0Y0L+rovwkAFi7l53mdN9Pso+9Wd99XavFYgIMfRb0tlT/0lO4ZbihATj4
 v59Vcns4lYLDa6EfkS7mVpIjsxB3lAokhG18iFqGloBGrugoW/ThBlni0ZVvalg7
 dml+6cv1XOvVKlEsZ8kloyyk+gB3T8QZvKU42PGD5oYNIye+JAPyse1AghJRpGqr
 8SwLWA/uhG22bNPDl9O3Djt7RVsiCScRURu7um4RtHVWk1ulplMPaX8hQ5j7tfOf
 HsToLQbSZz37odrvWeykKl3igP2wEfImHZm/ER6+CTCpaNQnWobINyitXV4ksHVr
 UO1it8YH3iDpAvX46MOXdjGuajUhpdTdteJzKiPnrv3W28dI3oyrxTb0AW1a0QUv
 hy8yYRDX71QcIKWLb/bmybgvHrOpIcZVn+FXSv1pmOesYoilgph+HN1CL2ef2EO3
 vwF2rm4U5GPHLaIvG+gP+8yoFzZhowUTwgdKlpC6DkXqzciq0wAAPQpFF11LH/BL
 rZ0Fr4vT7Yz6hxS/A4RplHYKheF2JDzFMVZmNql+OM3Z8ZBSLf2htkaYmDvRoP0H
 oTMSwNzlPhui5+HQ4qksMr7VfICByvvlQZ6te3bTThqcAgQYqXaXBo48sCluh+Ug
 2p5T4pvnp/aSqASKeFfP
 =N9R3
 -----END PGP SIGNATURE-----

Merge tag 'acpi-urgent-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael Wysocki:
 "These fix a recent ACPICA regression affecting control method
  execution at the table level and an earlier hibernation regression in
  the ACPI driver for Intel SoCs (LPSS) that was missed by a previous
  fix in this cycle.

  Specifics:

   - Fix a recent ACPICA regression introduced by a previous fix that
     caused control method execution at the table level to be mishandled
     by mistake (Erik Schmauss).

   - Fix a hibernation regression from the 4.15 cycle in the ACPI driver
     for Intel SoCs (LPSS) that caused the platform firmware to be
     confused during resume from hibernation by the driver's PM quirks
     which was fixed for system-wide suspend/resume (ACPI S3) earlier in
     this cycle, but that previous fix missed the hibernation (ACPI S4)
     case (Rafael Wysocki)"

* tag 'acpi-urgent-4.18' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPICA: AML Parser: ignore control method status in module-level code
  ACPI / LPSS: Avoid PM quirks on suspend and resume from hibernation
2018-07-31 09:31:18 -07:00
Hari Vyas
44bda4b7d2 PCI: Fix is_added/is_busmaster race condition
When a PCI device is detected, pdev->is_added is set to 1 and proc and
sysfs entries are created.

When the device is removed, pdev->is_added is checked for one and then
device is detached with clearing of proc and sys entries and at end,
pdev->is_added is set to 0.

is_added and is_busmaster are bit fields in pci_dev structure sharing same
memory location.

A strange issue was observed with multiple removal and rescan of a PCIe
NVMe device using sysfs commands where is_added flag was observed as zero
instead of one while removing device and proc,sys entries are not cleared.
This causes issue in later device addition with warning message
"proc_dir_entry" already registered.

Debugging revealed a race condition between the PCI core setting the
is_added bit in pci_bus_add_device() and the NVMe driver reset work-queue
setting the is_busmaster bit in pci_set_master().  As these fields are not
handled atomically, that clears the is_added bit.

Move the is_added bit to a separate private flag variable and use atomic
functions to set and retrieve the device addition state.  This avoids the
race because is_added no longer shares a memory location with is_busmaster.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=200283
Signed-off-by: Hari Vyas <hari.vyas@broadcom.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
2018-07-31 11:27:54 -05:00
Ard Biesheuvel
c7513c2a27 crypto/arm64: aes-ce-gcm - add missing kernel_neon_begin/end pair
Calling pmull_gcm_encrypt_block() requires kernel_neon_begin() and
kernel_neon_end() to be used since the routine touches the NEON
register file. Add the missing calls.

Also, since NEON register contents are not preserved outside of
a kernel mode NEON region, pass the key schedule array again.

Fixes: 7c50136a8a ("crypto: arm64/aes-ghash - yield NEON after every ...")
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-07-31 13:20:30 +01:00
Srinivas Pandruvada
01e61a42a5 cpufreq: intel_pstate: Limit the scope of HWP dynamic boost platforms
Dynamic boosting of HWP performance on IO wake showed significant
improvement to IO workloads. This series was intended for Skylake Xeon
platforms only and feature was enabled by default based on CPU model
number.

But some Xeon platforms reused the Skylake desktop CPU model number. This
caused some undesirable side effects to some graphics workloads. Since
they are heavily IO bound, the increase in CPU performance decreased the
power available for GPU to do its computing and hence decrease in graphics
benchmark performance.

For example on a Skylake desktop, GpuTest benchmark showed average FPS
reduction from 529 to 506.

This change makes sure that HWP boost feature is only enabled for Skylake
server platforms by using ACPI FADT preferred PM Profile. If some desktop
users wants to get benefit of boost, they can still enable boost from
intel_pstate sysfs attribute "hwp_dynamic_boost".

Fixes: 41ab43c9c8 (cpufreq: intel_pstate: enable boost for Skylake Xeon)
Link: https://bugs.freedesktop.org/show_bug.cgi?id=107410
Reported-by: Eero Tamminen <eero.t.tamminen@intel.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-07-31 10:39:58 +02:00
Rafael J. Wysocki
5f95d39b42 Merge branch 'acpi-soc'
Merge a fix for hibernation regression in the ACPI driver for Intel
SoCs (LPSS).

* acpi-soc:
  ACPI / LPSS: Avoid PM quirks on suspend and resume from hibernation
2018-07-31 10:35:47 +02:00
Ingo Molnar
ce03b6d2b6 perf/urgent fixes: (Arnaldo Carvalho de Melo)
- Update the tools copy of several files, including perf_event.h,
   powerpc's asm/unistd.h (new io_pgetevents syscall), bpf.h and
   x86's memcpy_64.s (used in 'perf bench mem'), silencing the
   respective warnings during the perf tools build.
 
 - Fix the build on the alpine:edge distro.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEELb9bqkb7Te0zijNb1lAW81NSqkAFAltfcukACgkQ1lAW81NS
 qkDlVA//fNnB+pDgqDI9MHhaAxPWxWmmBfz49g1i+Yv5MqBTSAXUkdP6MuTSHXkS
 KHGWBHYjb56gORqQZRLf9hFbiM1XYHL97uU55zKd/9zhuqhpIIDCEBq+RS9hwPzH
 DMrpHvOKHGwr0NE/O90HPYcnqSZUWarpwY8O+l1o5R6KW8EMneNvZypc7AbBcRfv
 P8QzdJ2NpeFgkvO6xEk0LnwrQQkgs/7VlpwJ0bsJ471Q9ujB+YgHd6PnX+hvxsy+
 QdTrpN0NSgBYtRPanprXyMnYsWEUhLlVqlY4qpGJBzdQKfMGWriG9a+jihKQXwdl
 ntgfs/aUphRhDcd0Rk6pR3dcozzB4mHYLI4ieN8t+QqisLJZV4RVppgLflq5sMAj
 xC3Gbn/GZqxLnh4qgoWvYndOEEDm3KNvEtCTe+rBVxdXvLksQevXrpMi8QbCZsZ9
 xU5DB42fK+8Vd9Bg89UxhI6BKgA8CDzHps3JXSp64314m4+0bswXdpCauhT3LUqg
 /W57VL2py8f8xrJqDmWohVH02qVgLU5RElFZWKmq04scfV8ypwEyWQSOb5MNCOgH
 +Malh+LbTrqz2VxIprjszNIejqp7dyoMgcJeGkvBJvwtw6uViubY30YOuwewwoGI
 UeNFwbNXzT6KgsJmlQCB6h9RnvY/vaU01rG0LSoxRjU9WReHPqw=
 =r5Gg
 -----END PGP SIGNATURE-----

Merge tag 'perf-urgent-for-mingo-4.18-20180730' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

- Update the tools copy of several files, including perf_event.h,
  powerpc's asm/unistd.h (new io_pgetevents syscall), bpf.h and
  x86's memcpy_64.s (used in 'perf bench mem'), silencing the
  respective warnings during the perf tools build.

- Fix the build on the alpine:edge distro.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-07-31 07:43:48 +02:00
Kan Liang
156c8b58ef perf/x86/intel/uncore: Fix hardcoded index of Broadwell extra PCI devices
Masayoshi Mizuma reported that a warning message is shown while a CPU is
hot-removed on Broadwell servers:

  WARNING: CPU: 126 PID: 6 at arch/x86/events/intel/uncore.c:988
  uncore_pci_remove+0x10b/0x150
  Call Trace:
   pci_device_remove+0x42/0xd0
   device_release_driver_internal+0x148/0x220
   pci_stop_bus_device+0x76/0xa0
   pci_stop_root_bus+0x44/0x60
   acpi_pci_root_remove+0x1f/0x80
   acpi_bus_trim+0x57/0x90
   acpi_bus_trim+0x2e/0x90
   acpi_device_hotplug+0x2bc/0x4b0
   acpi_hotplug_work_fn+0x1a/0x30
   process_one_work+0x174/0x3a0
   worker_thread+0x4c/0x3d0
   kthread+0xf8/0x130

This bug was introduced by:

  commit 15a3e845b0 ("perf/x86/intel/uncore: Fix SBOX support for Broadwell CPUs")

The index of "QPI Port 2 filter" was hardcode to 2, but this conflicts with the
index of "PCU.3" which is "HSWEP_PCI_PCU_3", which equals to 2 as well.

To fix the conflict, the hardcoded index needs to be cleaned up:

 - introduce a new enumerator "BDX_PCI_QPI_PORT2_FILTER" for "QPI Port 2
   filter" on Broadwell,
 - increase UNCORE_EXTRA_PCI_DEV_MAX by one,
 - clean up the hardcoded index.

Debugged-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Suggested-by: Ingo Molnar <mingo@kernel.org>
Reported-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: msys.mizuma@gmail.com
Cc: stable@vger.kernel.org
Fixes: 15a3e845b0 ("perf/x86/intel/uncore: Fix SBOX support for Broadwell CPUs")
Link: http://lkml.kernel.org/r/1532953688-15008-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-07-31 07:43:37 +02:00
Linus Torvalds
f67077deb4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "Several smallish fixes, I don't think any of this requires another -rc
  but I'll leave that up to you:

   1) Don't leak uninitialzed bytes to userspace in xfrm_user, from Eric
      Dumazet.

   2) Route leak in xfrm_lookup_route(), from Tommi Rantala.

   3) Premature poll() returns in AF_XDP, from Björn Töpel.

   4) devlink leak in netdevsim, from Jakub Kicinski.

   5) Don't BUG_ON in fib_compute_spec_dst, the condition can
      legitimately happen. From Lorenzo Bianconi.

   6) Fix some spectre v1 gadgets in generic socket code, from Jeremy
      Cline.

   7) Don't allow user to bind to out of range multicast groups, from
      Dmitry Safonov with a follow-up by Dmitry Safonov.

   8) Fix metrics leak in fib6_drop_pcpu_from(), from Sabrina Dubroca"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits)
  netlink: Don't shift with UB on nlk->ngroups
  net/ipv6: fix metrics leak
  xen-netfront: wait xenbus state change when load module manually
  can: ems_usb: Fix memory leak on ems_usb_disconnect()
  openvswitch: meter: Fix setting meter id for new entries
  netlink: Do not subscribe to non-existent groups
  NET: stmmac: align DMA stuff to largest cache line length
  tcp_bbr: fix bw probing to raise in-flight data for very small BDPs
  net: socket: Fix potential spectre v1 gadget in sock_is_registered
  net: socket: fix potential spectre v1 gadget in socketcall
  net: mdio-mux: bcm-iproc: fix wrong getter and setter pair
  ipv4: remove BUG_ON() from fib_compute_spec_dst
  enic: handle mtu change for vf properly
  net: lan78xx: fix rx handling before first packet is send
  nfp: flower: fix port metadata conversion bug
  bpf: use GFP_ATOMIC instead of GFP_KERNEL in bpf_parse_prog()
  bpf: fix bpf_skb_load_bytes_relative pkt length check
  perf build: Build error in libbpf missing initialization
  net: ena: Fix use of uninitialized DMA address bits field
  bpf: btf: Use exact btf value_size match in map_check_btf()
  ...
2018-07-30 21:40:37 -07:00
Linus Torvalds
5723b4a3cc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:
 "Some small __init annotation and build fixes from Stephen Rostedt and
  Thomas Petazzoni"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: use asm-generic version of msi.h
  sparc: move MSI related definitions to where they are used
  sparc/time: Add missing __init to init_tick_ops()
2018-07-30 18:36:20 -07:00
Linus Torvalds
d512584780 squashfs: more metadata hardening
Anatoly reports another squashfs fuzzing issue, where the decompression
parameters themselves are in a compressed block.

This causes squashfs_read_data() to be called in order to read the
decompression options before the decompression stream having been set
up, making squashfs go sideways.

Reported-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Acked-by: Phillip Lougher <phillip.lougher@gmail.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-07-30 17:29:17 -07:00
Jakub Kicinski
2d55d614fc net: xsk: don't return frames via the allocator on error
xdp_return_buff() is used when frame has been successfully
handled (transmitted) or if an error occurred during delayed
processing and there is no way to report it back to
xdp_do_redirect().

In case of __xsk_rcv_zc() error is propagated all the way
back to the driver, so there is no need to call
xdp_return_buff().  Driver will recycle the frame anyway
after seeing that error happened.

Fixes: 173d3adb6f ("xsk: add zero-copy support for Rx")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-31 02:03:19 +02:00
Yonghong Song
573b3aa694 tools/bpftool: fix a percpu_array map dump problem
I hit the following problem when I tried to use bpftool
to dump a percpu array.

  $ sudo ./bpftool map show
  61: percpu_array  name stub  flags 0x0
          key 4B  value 4B  max_entries 1  memlock 4096B
  ...
  $ sudo ./bpftool map dump id 61
  bpftool: malloc.c:2406: sysmalloc: Assertion
  `(old_top == initial_top (av) && old_size == 0) || \
   ((unsigned long) (old_size) >= MINSIZE && \
   prev_inuse (old_top) && \
   ((unsigned long) old_end & (pagesize - 1)) == 0)'
  failed.
  Aborted

Further debugging revealed that this is due to
miscommunication between bpftool and kernel.
For example, for the above percpu_array with value size of 4B.
The map info returned to user space has value size of 4B.

In bpftool, the values array for lookup is allocated like:
   info->value_size * get_possible_cpus() = 4 * get_possible_cpus()
In kernel (kernel/bpf/syscall.c), the values array size is
rounded up to multiple of 8.
   round_up(map->value_size, 8) * num_possible_cpus()
   = 8 * num_possible_cpus()
So when kernel copies the values to user buffer, the kernel will
overwrite beyond user buffer boundary.

This patch fixed the issue by allocating and stepping through
percpu map value array properly in bpftool.

Fixes: 71bb428fe2 ("tools: bpf: add bpftool")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-07-31 00:37:09 +02:00
Yi Wang
b305f7ed0f audit: fix potential null dereference 'context->module.name'
The variable 'context->module.name' may be null pointer when
kmalloc return null, so it's better to check it before using
to avoid null dereference.
Another one more thing this patch does is using kstrdup instead
of (kmalloc + strcpy), and signal a lost record via audit_log_lost.

Cc: stable@vger.kernel.org # 4.11
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Reviewed-by: Jiang Biao <jiang.biao2@zte.com.cn>
Reviewed-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
2018-07-30 18:09:37 -04:00
Thomas Petazzoni
12be1036c5 sparc: use asm-generic version of msi.h
This is necessary to be able to include <linux/msi.h> when
CONFIG_GENERIC_MSI_IRQ_DOMAIN is enabled. Without this, a build with
CONFIG_GENERIC_MSI_IRQ_DOMAIN fails with:

   In file included from drivers//ata/ahci.c:45:0:
>> include/linux/msi.h:226:10: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
             msi_alloc_info_t *arg);
             ^~~~~~~~~~~~~~~~
             sg_alloc_fn
   include/linux/msi.h:230:9: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
            msi_alloc_info_t *arg);
            ^~~~~~~~~~~~~~~~
            sg_alloc_fn
   include/linux/msi.h:239:12: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
               msi_alloc_info_t *arg);
               ^~~~~~~~~~~~~~~~
               sg_alloc_fn
   include/linux/msi.h:240:22: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
     void  (*msi_finish)(msi_alloc_info_t *arg, int retval);
                         ^~~~~~~~~~~~~~~~
                         sg_alloc_fn
   include/linux/msi.h:241:20: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
     void  (*set_desc)(msi_alloc_info_t *arg,
                       ^~~~~~~~~~~~~~~~
                       sg_alloc_fn
   include/linux/msi.h:316:18: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
           int nvec, msi_alloc_info_t *args);
                     ^~~~~~~~~~~~~~~~
                     sg_alloc_fn
   include/linux/msi.h:318:29: error: unknown type name 'msi_alloc_info_t'; did you mean 'sg_alloc_fn'?
            int virq, int nvec, msi_alloc_info_t *args);
                                ^~~~~~~~~~~~~~~~
                                sg_alloc_fn

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30 13:00:56 -07:00
Thomas Petazzoni
f0afc6b18d sparc: move MSI related definitions to where they are used
The definitions in arch/sparc/include/asm/msi.h are only used in
arch/sparc/mm/srmmu.c, so it makes sense to have them in the C file
directly.

In addition, having a custom arch/sparc/include/asm/msi.h prevents
from using the asm-generic version of this header, which is necessary
to be able to include <linux/msi.h> when CONFIG_GENERIC_MSI_IRQ_DOMAIN
is enabled.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30 13:00:56 -07:00
Steven Rostedt (VMware)
6f57ed681e sparc/time: Add missing __init to init_tick_ops()
Code that was added to force gcc not to inline any function that isn't
explicitly declared as inline uncovered that init_tick_ops() isn't
marked as "__init". It is only called by __init functions and more
importantly it too calls an __init function which would require it to be
__init as well.

Link: http://lkml.kernel.org/r/201806060444.hdHcKOBy%fengguang.wu@intel.com

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30 12:48:29 -07:00
Dmitry Safonov
61f4b23769 netlink: Don't shift with UB on nlk->ngroups
On i386 nlk->ngroups might be 32 or 0. Which leads to UB, resulting in
hang during boot.
Check for 0 ngroups and use (unsigned long long) as a type to shift.

Fixes: 7acf9d4237 ("netlink: Do not subscribe to non-existent groups").
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30 12:42:22 -07:00
Yidong Ren
6ae7467112 hv_netvsc: Add per-cpu ethtool stats for netvsc
This patch implements following ethtool stats fields for netvsc:
cpu<n>_tx/rx_packets/bytes
cpu<n>_vf_tx/rx_packets/bytes

Corresponding per-cpu counters already exist in current code. Exposing
these counters will help troubleshooting performance issues.

for_each_present_cpu() was used instead of for_each_possible_cpu().
for_each_possible_cpu() would create very long and useless output.
It is still being used for internal buffer, but not for ethtool
output.

There could be an overflow if cpu was added between ethtool
call netvsc_get_sset_count() and netvsc_get_ethtool_stats() and
netvsc_get_strings(). (still safe if cpu was removed)
ethtool makes these three function calls separately.
As long as we use ethtool, I can't see any clean solution.

Currently and in foreseeable short term, Hyper-V doesn't support
cpu hot-plug. Plus, ethtool is for admin use. Unlikely the admin
would perform such combo operations.

Changes in v2:
  - Remove cpp style comment
  - Resubmit after freeze

Changes in v3:
  - Reimplemented with kvmalloc instead of alloc_percpu

Changes in v4:
  - Fixed inconsistent array size
  - Use kvmalloc_array instead of kvmalloc

Signed-off-by: Yidong Ren <yidren@microsoft.com>
Reviewed-by: Stephen Hemminger <sthemmin@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30 12:35:04 -07:00
David S. Miller
af87f72e75 linux-can-fixes-for-4.18-20180730
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEENrCndlB/VnAEWuH5k9IU1zQoZfEFAlte1NUTHG1rbEBwZW5n
 dXRyb25peC5kZQAKCRCT0hTXNChl8Z0zB/9CkuxbUOB8QsoorZ6NwoVcK4wNZNhN
 SRB8BpJpZn+Bye5hxKvMXU3yeoZj5xIR4mopLiiKM/pp3y3yqk3uhh971SQmAhfa
 ULv+leBiCaFatXgYQTngOi6Xg6faJvu6ZYKUzvNPT1gW40HQTYUCCJwdgvlCbGLs
 AVa04gV6o5lSolCboEjeKLnsW9ByHBBkLaOOZCixt+VfS76gO29fYJmplz7WC9pW
 Udw2KiQBSz/D5iMUoYxb9KiMAa5OL9dJfHfy4jIdsEMekBUj66KjzbFOBq90c1Pg
 PVEABATxFTplL1He+PzkJLu15csnhH0VSurUenkoOfdMkbu1LCmdwd4S
 =053y
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-fixes-for-4.18-20180730' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can

Marc Kleine-Budde says:

====================
pull-request: can 2018-07-30

this is a pull request of one patch for net/master.

The patch by Anton Vasilyev and the Linux Driver Verification project
fixes a memory leak in the ems_usb driver's disconnect function.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30 12:32:09 -07:00
Linus Torvalds
527838d470 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Misc fixes:

   - a build race fix

   - a Xen entry fix

   - a TSC_DEADLINE quirk future-proofing fix"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot: Fix if_changed build flip/flop bug
  x86/entry/64: Remove %ebx handling from error_entry/exit
  x86/apic: Future-proof the TSC_DEADLINE quirk for SKX
2018-07-30 12:16:03 -07:00
Linus Torvalds
ae3e10aba5 Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull scheduler fixes from Ingo Molnar:
 "Misc fixes:

   - a deadline scheduler related bug fix which triggered a kernel
     warning

   - an RT_RUNTIME_SHARE fix

   - a stop_machine preemption fix

   - a potential NULL dereference fix in sched_domain_debug_one()"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/rt: Restore rt_runtime after disabling RT_RUNTIME_SHARE
  sched/deadline: Update rq_clock of later_rq when pushing a task
  stop_machine: Disable preemption after queueing stopper threads
  sched/topology: Check variable group before dereferencing it
2018-07-30 12:13:56 -07:00
Randy Dunlap
ec837d620c arc: fix type warnings in arc/mm/cache.c
Fix type warnings in arch/arc/mm/cache.c.

../arch/arc/mm/cache.c: In function 'flush_anon_page':
../arch/arc/mm/cache.c:1062:55: warning: passing argument 2 of '__flush_dcache_page' makes integer from pointer without a cast [-Wint-conversion]
  __flush_dcache_page((phys_addr_t)page_address(page), page_address(page));
                                                       ^~~~~~~~~~~~~~~~~~
../arch/arc/mm/cache.c:1013:59: note: expected 'long unsigned int' but argument is of type 'void *'
 void __flush_dcache_page(phys_addr_t paddr, unsigned long vaddr)
                                             ~~~~~~~~~~~~~~^~~~~

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: linux-snps-arc@lists.infradead.org
Cc: Elad Kanfi <eladkan@mellanox.com>
Cc: Leon Romanovsky <leonro@mellanox.com>
Cc: Ofer Levi <oferle@mellanox.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2018-07-30 11:48:50 -07:00
Randy Dunlap
2423665ec5 arc: fix build errors in arc/include/asm/delay.h
Fix build errors in arch/arc/'s delay.h:
- add "extern unsigned long loops_per_jiffy;"
- add <asm-generic/types.h> for "u64"

In file included from ../drivers/infiniband/hw/cxgb3/cxio_hal.c:32:
../arch/arc/include/asm/delay.h: In function '__udelay':
../arch/arc/include/asm/delay.h:61:12: error: 'u64' undeclared (first use in this function)
  loops = ((u64) usecs * 4295 * HZ * loops_per_jiffy) >> 32;
            ^~~

In file included from ../drivers/infiniband/hw/cxgb3/cxio_hal.c:32:
../arch/arc/include/asm/delay.h: In function '__udelay':
../arch/arc/include/asm/delay.h:63:37: error: 'loops_per_jiffy' undeclared (first use in this function)
  loops = ((u64) usecs * 4295 * HZ * loops_per_jiffy) >> 32;
                                     ^~~~~~~~~~~~~~~

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: linux-snps-arc@lists.infradead.org
Cc: Elad Kanfi <eladkan@mellanox.com>
Cc: Leon Romanovsky <leonro@mellanox.com>
Cc: Ofer Levi <oferle@mellanox.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2018-07-30 11:48:50 -07:00
Randy Dunlap
9e2ea40554 arc: [plat-eznps] fix printk warning in arc/plat-eznps/mtm.c
Fix printk format warning in arch/arc/plat-eznps/mtm.c:

In file included from ../include/linux/printk.h:7,
                 from ../include/linux/kernel.h:14,
                 from ../include/linux/list.h:9,
                 from ../include/linux/smp.h:12,
                 from ../arch/arc/plat-eznps/mtm.c:17:
../arch/arc/plat-eznps/mtm.c: In function 'set_mtm_hs_ctr':
../include/linux/kern_levels.h:5:18: warning: format '%d' expects argument of type 'int', but argument 2 has type 'long int' [-Wformat=]
 #define KERN_SOH "\001"  /* ASCII Start Of Header */
                  ^~~~~~
../include/linux/kern_levels.h:11:18: note: in expansion of macro 'KERN_SOH'
 #define KERN_ERR KERN_SOH "3" /* error conditions */
                  ^~~~~~~~
../include/linux/printk.h:308:9: note: in expansion of macro 'KERN_ERR'
  printk(KERN_ERR pr_fmt(fmt), ##__VA_ARGS__)
         ^~~~~~~~
../arch/arc/plat-eznps/mtm.c:166:3: note: in expansion of macro 'pr_err'
   pr_err("** Invalid @nps_mtm_hs_ctr [%d] needs to be [%d:%d] (incl)\n",
   ^~~~~~
../arch/arc/plat-eznps/mtm.c:166:40: note: format string is defined here
   pr_err("** Invalid @nps_mtm_hs_ctr [%d] needs to be [%d:%d] (incl)\n",
                                       ~^
                                       %ld
The hs_ctr variable can just be int instead of long, so also change
kstrtol() to kstrtoint() and leave the format string as %d.

Also add 2 header files since they are used in mtm.c and we prefer
not to depend on accidental/indirect #includes.

Cc: linux-snps-arc@lists.infradead.org
Cc: Ofer Levi <oferle@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2018-07-30 11:48:49 -07:00
Randy Dunlap
b1f32ce1c3 arc: [plat-eznps] fix data type errors in platform headers
Add <linux/types.h> to fix build errors.
Both ctop.h and <soc/nps/common.h> use u32 types and cause many
errors.

Examples:
../include/soc/nps/common.h:71:4: error: unknown type name 'u32'
    u32 __reserved:20, cluster:4, core:4, thread:4;
../include/soc/nps/common.h:76:3: error: unknown type name 'u32'
   u32 value;
../include/soc/nps/common.h:124:4: error: unknown type name 'u32'
    u32 base:8, cl_x:4, cl_y:4,
../include/soc/nps/common.h:127:3: error: unknown type name 'u32'
   u32 value;

../arch/arc/plat-eznps/include/plat/ctop.h:83:4: error: unknown type name 'u32'
    u32 gen:1, gdis:1, clk_gate_dis:1, asb:1,
../arch/arc/plat-eznps/include/plat/ctop.h:86:3: error: unknown type name 'u32'
   u32 value;
../arch/arc/plat-eznps/include/plat/ctop.h:93:4: error: unknown type name 'u32'
    u32 csa:22, dmsid:6, __reserved:3, cs:1;
../arch/arc/plat-eznps/include/plat/ctop.h:95:3: error: unknown type name 'u32'
   u32 value;

Cc: linux-snps-arc@lists.infradead.org
Cc: Ofer Levi <oferle@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2018-07-30 11:48:49 -07:00
Linus Torvalds
0634922a78 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Misc fixes:

   - AMD IBS data corruptor fix (uncovered by UBSAN)

   - an Intel PEBS entry unwind error fix

   - a HW-tracing crash fix

   - a MAINTAINERS update"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/core: Fix crash when using HW tracing kernel filters
  perf/x86/intel: Fix unwind errors from PEBS entries (mk-II)
  MAINTAINERS: Add Naveen N. Rao as kprobes co-maintainer
  perf/x86/amd/ibs: Don't access non-started event
2018-07-30 11:45:30 -07:00
Linus Torvalds
fb20c03d37 Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Ingo Molnar:
 "A paravirt UP-patching fix, and an I2C MUX driver lockdep warning fix"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/pvqspinlock/x86: Use LOCK_PREFIX in __pv_queued_spin_unlock() assembly code
  i2c/mux, locking/core: Annotate the nested rt_mutex usage
  locking/rtmutex: Allow specifying a subclass for nested locking
2018-07-30 11:37:16 -07:00
Linus Torvalds
d464b0314c Merge branch 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI fix from Ingo Molnar:
 "An UEFI variables fix for SEV guests"

* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/efi: Access EFI MMIO data as unencrypted when SEV is active
2018-07-30 11:07:34 -07:00
Ofer Levi
05b466bf84 ARC: [plat-eznps] Add missing struct nps_host_reg_aux_dpc
Fixing compilation issue caused by missing struct nps_host_reg_aux_dpc
definition.

Fixes: 3f9cd874dc ("ARC: [plat-eznps] avoid toggling of DPC register")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Ofer Levi <oferle@mellanox.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2018-07-30 09:47:53 -07:00
David S. Miller
494f2e765d Merge branch 'selftests-mirror-to-gretap-with-team'
Petr Machata says:

====================
A test for mirror-to-gretap with team in UL packet path

This patchset adds a test for "tc action mirred mirror" where the
mirrored-to device is a gretap, and underlay path contains a team
device.

In patch #1 require_command() is added, which should henceforth be used
to declare dependence on a certain tool.

In patch #2, two new functions, team_create() and team_destroy(), are
added to lib.sh.

The newly-added test uses arping, which isn't necessarily available.
Therefore patch #3 introduces $ARPING, and a preexisting test is fixed
to require_command $ARPING.

In patches #4 and #5, two new tests are added. In both cases, a team
device is on egress path of a mirrored packet in a mirror-to-gretap
scenario. In the first one, the team device is in loadbalance mode, in
the second one it's in lacp mode. (The difference in modes necessitates
a different testing strategy, hence two test cases instead of just
parameterizing one.)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30 09:47:22 -07:00
Petr Machata
541c6ce30f selftests: forwarding: Test mirror-to-gretap w/ UL team LACP
This tests mirror-to-gretap when an underlay packet path includes a team
device which is not in loadbalance mode, but in LACP mode. The test
manipulates LAG membership to achieve changes in txability, thus making
sure that a driver that offloads mirror-to-gretap doesn't just consider
upness of a device.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30 09:47:21 -07:00
Petr Machata
a9b33b2001 selftests: forwarding: Test mirror-to-gretap w/ UL team
Test for "tc action mirred egress mirror" that mirrors to gretap when
the underlay route points at a VLAN-aware bridge (802.1q), and the
traffic egresses the bridge through a team device. Test upping and
downing individual team device slaves and verify the traffic flows as
expected.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30 09:47:21 -07:00
Petr Machata
ca70a56238 selftests: forwarding: Introduce $ARPING
Instead of relying on "arping" being installed everywhere under that
name, introduce a variable $ARPING like the other tools do.

Convert an existing test, mirror_gre_vlan_bridge_1q.sh to
require_command $ARPING and then invoke arping through the variable.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30 09:47:21 -07:00
Petr Machata
9d9e6bde3d selftests: forwarding: lib: Support team devices
Add team_create() and team_destroy() to manage team netdevices.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30 09:47:21 -07:00
Petr Machata
e094574f9b selftests: forwarding: lib: Add require_command()
The logic for testing whether a certain command is available is used
several times in the current code base. The tests in follow-up patches
add more requirements like that.

Therefore extract the logic into a named function, require_command(),
that can be used directly from lib.sh as well as from any test that
wishes to declare dependence on some command.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30 09:47:21 -07:00
Eugeniy Paltsev
386177da9e ARC: add SMP_CACHE_BYTES value validate
Check that SMP_CACHE_BYTES (and hence ARCH_DMA_MINALIGN) is larger
or equal to any cache line length by comparing it with values
previously read from ARC cache BCR registers.

Signed-off-by: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
2018-07-30 09:46:19 -07:00
Sabrina Dubroca
df18b50448 net/ipv6: fix metrics leak
Since commit d4ead6b34b ("net/ipv6: move metrics from dst to
rt6_info"), ipv6 metrics are shared and refcounted. rt6_set_from()
assigns the rt->from pointer and increases the refcount on from's
metrics. This reference is never released.

Introduce the fib6_metrics_release() helper and use it to release the
metrics.

Fixes: d4ead6b34b ("net/ipv6: move metrics from dst to rt6_info")
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30 09:45:57 -07:00
YueHaibing
778c4d5c5b fib_rules: NULL check before kfree is not needed
kfree(NULL) is safe,so this removes NULL check before freeing the mem

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30 09:44:06 -07:00
Quentin Schulz
86ff73622a net: phy: mscc: the extended page access register is 16 bits
The Extended Page Access is a 16-bit register, so change the page
parameter of vsc85xx_phy_page_set to a u16.

Signed-off-by: Quentin Schulz <quentin.schulz@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30 09:42:32 -07:00
Vakul Garg
ad13acce8d net/tls: Use socket data_ready callback on record availability
On receipt of a complete tls record, use socket's saved data_ready
callback instead of state_change callback. In function tls_queue(),
the TLS record is queued in encrypted state. But the decryption
happen inline when tls_sw_recvmsg() or tls_sw_splice_read() get invoked.
So it should be ok to notify the waiting context about the availability
of data as soon as we could collect a full TLS record. For new data
availability notification, sk_data_ready callback is more appropriate.
It points to sock_def_readable() which wakes up specifically for EPOLLIN
event. This is in contrast to the socket callback sk_state_change which
points to sock_def_wakeup() which issues a wakeup unconditionally
(without event mask).

Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30 09:41:41 -07:00
Xiao Liang
822fb18a82 xen-netfront: wait xenbus state change when load module manually
When loading module manually, after call xenbus_switch_state to initializes
the state of the netfront device, the driver state did not change so fast
that may lead no dev created in latest kernel. This patch adds wait to make
sure xenbus knows the driver is not in closed/unknown state.

Current state:
[vm]# ethtool eth0
Settings for eth0:
	Link detected: yes
[vm]# modprobe -r xen_netfront
[vm]# modprobe  xen_netfront
[vm]# ethtool eth0
Settings for eth0:
Cannot get device settings: No such device
Cannot get wake-on-lan settings: No such device
Cannot get message level: No such device
Cannot get link status: No such device
No data available

With the patch installed.
[vm]# ethtool eth0
Settings for eth0:
	Link detected: yes
[vm]# modprobe -r xen_netfront
[vm]# modprobe xen_netfront
[vm]# ethtool eth0
Settings for eth0:
	Link detected: yes

Signed-off-by: Xiao Liang <xiliang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30 09:40:19 -07:00
David S. Miller
8f3f6500c7 Merge branch 'TC-refactor-act_mirred-packets-re-injection'
Paolo Abeni says:

====================
TC: refactor act_mirred packets re-injection

This series is aimed at improving the act_mirred redirect performances.
Such action is used by OVS to represent TC S/W flows, and it's current largest
bottle-neck is the need for a skb_clone() for each packet.

The first 2 patches introduce some cleanup and safeguards to allow extending
tca_result - we will use it to store RCU protected redirect information - and
introduce a clear separation between user-space accessible tcfa_action
values and internal values accessible only by the kernel.
Then a new tcfa_action value is introduced: TC_ACT_REINJECT, similar to
TC_ACT_REDIRECT, but preserving the mirred semantic. Such value is not
accessible from user-space.
The last patch exploits the newly introduced infrastructure in the act_mirred
action, to avoid a skb_clone, when possible.

Overall this the above gives a ~10% performance improvement in forwarding tput,
when using the TC S/W datapath.

v1 -> v2:
 - preserve the rcu lock in act_bpf
 - add and use a new action value to reinject the packets, preserving the mirred
   semantic

v2 -> v3:
 - renamed to new action as TC_ACT_REINJECT
 - TC_ACT_REINJECT is not exposed to user-space

v3 -> v4:
 - dropped the TC_ACT_REDIRECT patch
 - report failure via extack, too
 - rename the new action as TC_ACT_REINSERT
 - skip clone only if the control action don't touch tcf_result

v4 -> v5:
 - fix a couple of build issue reported by kbuild bot
 - dont split messages
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30 09:31:14 -07:00
Paolo Abeni
e5cf1baf92 act_mirred: use TC_ACT_REINSERT when possible
When mirred is invoked from the ingress path, and it wants to redirect
the processed packet, it can now use the TC_ACT_REINSERT action,
filling the tcf_result accordingly, and avoiding a per packet
skb_clone().

Overall this gives a ~10% improvement in forwarding performance for the
TC S/W data path and TC S/W performances are now comparable to the
kernel openvswitch datapath.

v1 -> v2: use ACT_MIRRED instead of ACT_REDIRECT
v2 -> v3: updated after action rename, fixed typo into the commit
	message
v3 -> v4: updated again after action rename, added more comments to
	the code (JiriP), skip the optimization if the control action
	need to touch the tcf_result (Paolo)
v4 -> v5: fix sparse warning (kbuild bot)

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30 09:31:14 -07:00
Paolo Abeni
cd11b16407 net/tc: introduce TC_ACT_REINSERT.
This is similar TC_ACT_REDIRECT, but with a slightly different
semantic:
- on ingress the mirred skbs are passed to the target device
network stack without any additional check not scrubbing.
- the rcu-protected stats provided via the tcf_result struct
  are updated on error conditions.

This new tcfa_action value is not exposed to the user-space
and can be used only internally by clsact.

v1 -> v2: do not touch TC_ACT_REDIRECT code path, introduce
 a new action type instead
v2 -> v3:
 - rename the new action value TC_ACT_REINJECT, update the
   helper accordingly
 - take care of uncloned reinjected packets in XDP generic
   hook
v3 -> v4:
 - renamed again the new action value (JiriP)
v4 -> v5:
 - fix build error with !NET_CLS_ACT (kbuild bot)

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30 09:31:14 -07:00
Paolo Abeni
7fd4b288ea tc/act: remove unneeded RCU lock in action callback
Each lockless action currently does its own RCU locking in ->act().
This allows using plain RCU accessor, even if the context
is really RCU BH.

This change drops the per action RCU lock, replace the accessors
with the _bh variant, cleans up a bit the surrounding code and
documents the RCU status in the relevant header.
No functional nor performance change is intended.

The goal of this patch is clarifying that the RCU critical section
used by the tc actions extends up to the classifier's caller.

v1 -> v2:
 - preserve rcu lock in act_bpf: it's needed by eBPF helpers,
   as pointed out by Daniel

v3 -> v4:
 - fixed some typos in the commit message (JiriP)

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2018-07-30 09:31:13 -07:00