Pull drm updates from Dave Airlie:
"Core:
- Fence destaging work
- DRIVER_LEGACY to split off legacy drm drivers
- drm_mm refactoring
- Splitting drm_crtc.c into chunks and documenting better
- Display info fixes
- rbtree support for prime buffer lookup
- Simple VGA DAC driver
Panel:
- Add Nexus 7 panel
- More simple panels
i915:
- Refactoring GEM naming
- Refactored vma/active tracking
- Lockless request lookups
- Better stolen memory support
- FBC fixes
- SKL watermark fixes
- VGPU improvements
- dma-buf fencing support
- Better DP dongle support
amdgpu:
- Powerplay for Iceland asics
- Improved GPU reset support
- UVD/VEC powergating support for CZ/ST
- Preinitialised VRAM buffer support
- Virtual display support
- Initial SI support
- GTT rework
- PCI shutdown callback support
- HPD IRQ storm fixes
amdkfd:
- bugfixes
tilcdc:
- Atomic modesetting support
mediatek:
- AAL + GAMMA engine support
- Hook up gamma LUT
- Temporal dithering support
imx:
- Pixel clock from devicetree
- drm bridge support for LVDS bridges
- active plane reconfiguration
- VDIC deinterlacer support
- Frame synchronisation unit support
- Color space conversion support
analogix:
- PSR support
- Better panel on/off support
rockchip:
- rk3399 vop/crtc support
- PSR support
vc4:
- Interlaced vblank timing
- 3D rendering CPU overhead reduction
- HDMI output fixes
tda998x:
- HDMI audio ASoC support
sunxi:
- Allwinner A33 support
- better TCON support
msm:
- DT binding cleanups
- Explicit fence-fd support
sti:
- remove sti415/416 support
etnaviv:
- MMUv2 refactoring
- GC3000 support
exynos:
- Refactoring HDMI DCC/PHY
- G2D pm regression fix
- Page fault issues with wait for vblank
There is no nouveau work in this tree, as Ben didn't get a pull
request in, and he was fighting moving to atomic and adding mst
support, so maybe best it waits for a cycle"
* tag 'drm-for-v4.9' of git://people.freedesktop.org/~airlied/linux: (1412 commits)
drm/crtc: constify drm_crtc_index parameter
drm/i915: Fix conflict resolution from backmerge of v4.8-rc8 to drm-next
drm/i915/guc: Unwind GuC workqueue reservation if request construction fails
drm/i915: Reset the breadcrumbs IRQ more carefully
drm/i915: Force relocations via cpu if we run out of idle aperture
drm/i915: Distinguish last emitted request from last submitted request
drm/i915: Allow DP to work w/o EDID
drm/i915: Move long hpd handling into the hotplug work
drm/i915/execlists: Reinitialise context image after GPU hang
drm/i915: Use correct index for backtracking HUNG semaphores
drm/i915: Unalias obj->phys_handle and obj->userptr
drm/i915: Just clear the mmiodebug before a register access
drm/i915/gen9: only add the planes actually affected by ddb changes
drm/i915: Allow PCH DPLL sharing regardless of DPLL_SDVO_HIGH_SPEED
drm/i915/bxt: Fix HDMI DPLL configuration
drm/i915/gen9: fix the watermark res_blocks value
drm/i915/gen9: fix plane_blocks_per_line on watermarks calculations
drm/i915/gen9: minimum scanlines for Y tile is not always 4
drm/i915/gen9: fix the WaWmMemoryReadLatency implementation
drm/i915/kbl: KBL also needs to run the SAGV code
...
Merge more updates from Andrew Morton:
- a few block updates that fell in my lap
- lib/ updates
- checkpatch
- autofs
- ipc
- a ton of misc other things
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (100 commits)
mm: split gfp_mask and mapping flags into separate fields
fs: use mapping_set_error instead of opencoded set_bit
treewide: remove redundant #include <linux/kconfig.h>
hung_task: allow hung_task_panic when hung_task_warnings is 0
kthread: add kerneldoc for kthread_create()
kthread: better support freezable kthread workers
kthread: allow to modify delayed kthread work
kthread: allow to cancel kthread work
kthread: initial support for delayed kthread work
kthread: detect when a kthread work is used by more workers
kthread: add kthread_destroy_worker()
kthread: add kthread_create_worker*()
kthread: allow to call __kthread_create_on_node() with va_list args
kthread/smpboot: do not park in kthread_create_on_cpu()
kthread: kthread worker API cleanup
kthread: rename probe_kthread_data() to kthread_probe_data()
scripts/tags.sh: enable code completion in VIM
mm: kmemleak: avoid using __va() on addresses that don't have a lowmem mapping
kdump, vmcoreinfo: report memory sections virtual addresses
ipc/sem.c: add cond_resched in exit_sme
...
Since linux/auto_dev-ioctl.h wasn't included in include/linux/Kbuild
it wasn't moved to uapi/linux as part of the uapi series.
Link: http://lkml.kernel.org/r/20160812024901.12352.10984.stgit@pluto.themaw.net
Signed-off-by: Ian Kent <raven@themaw.net>
Cc: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
linux/limits.h should be included by uapi instead of linux/auto_fs.h
so as not to cause compile error in userspace.
# cat << EOF > ./test1.c
> #include <stdio.h>
> #include <linux/auto_fs.h>
> int main(void) {
> return 0;
> }
> EOF
# gcc -Wall -g ./test1.c
In file included from ./test1.c:2:0:
/usr/include/linux/auto_fs.h:54:12: error: 'NAME_MAX' undeclared here (not in a function)
char name[NAME_MAX+1];
^
Link: http://lkml.kernel.org/r/20160812024856.12352.24092.stgit@pluto.themaw.net
Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com>
Signed-off-by: Ian Kent <ikent@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJX/QbjAAoJEAhfPr2O5OEVDKkP/30o73ZhzBkDR3xgApbmVdrw
1NQYZq8UKibZ87hv949535N3lwaHFV0mA8ylheu2MMArd1GoZvyXKqNbJN9316kQ
mSI8wVK77UiBP7RRolEepCuliQExNmayUm+kNZEZsF67+flilkcumCBqlPf114Sl
ruhpGTSAIz2mgbxGsPkFiN+4xl2AZFOjiiHsp9doBE8HAtEp3PyCrPv5T6zkK7PQ
KKf7ribcIB65tx0zBmhkfJOef/mqK/t7XgQS7kVRB3G4zr1nkh4g2iw/QbUreBtE
94p1VYAMBFfpCNe1rWaaBOxYRLsDBMQHz2LvOvj8HZKrsuBCKQQ4jAoYQ4bNi8cu
nWAb5Z19npoxJRYCGrPs8MJtCFD1IoT4zjiA8Ld5BT4SqBsCQ6VrgiUpQESzjtlj
Xp7V1D2ak3vx40FAuDGZsb7JwGTuIrK18rZyKSjvHbnydWiJlaHY9kR3lOe91wc2
MZOiD3K4lM5Lvse07nLVgOTjXW1fC3ScliRCQVLU/Wbm6A8UKiejES8sy0bFk9sU
8Go3RaAPVeQLGFLqOJG+6yu7sJ1FCZzAthKbpxtY8p/iKZE4QO0n4Y6Q2NjcjHJt
lDKYp83jne+AMthbLR+Ab6IL2GoOxaW6fnTrDioDxGc9Cvba90xYsZCIxbcGrM4h
cu1bOLUp5Ei1wHvaqRla
=JqCR
-----END PGP SIGNATURE-----
Merge tag 'media/v4.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab:
- Documentation improvements: conversion of all non-DocBook documents
to Sphinx and lots of fixes to the uAPI media book
- New PCI driver for Techwell TW5864 media grabber boards
- New SoC driver for ATMEL Image Sensor Controller
- Removal of some obsolete SoC drivers (s5p-tv driver and soc_camera
drivers)
- Addition of ST CEC driver
- Lots of drivers fixes, improvements and additions
* tag 'media/v4.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (464 commits)
[media] ttusb_dec: avoid the risk of go past buffer
[media] cx23885: Fix some smatch warnings
[media] si2165: switch to regmap
[media] si2165: use i2c_client->dev instead of i2c_adapter->dev for logging
[media] si2165: Remove legacy attach
[media] cx231xx: attach si2165 driver via i2c_client
[media] cx231xx: Prepare for attaching new style i2c_client DVB demod drivers
[media] cx23885: attach si2165 driver via i2c_client
[media] si2165: support i2c_client attach
[media] si2165: avoid division by zero
[media] rcar-vin: add R-Car gen2 fallback compatibility string
[media] lgdt3306a: remove 20*50 msec unnecessary timeout
[media] cx25821: Remove deprecated create_singlethread_workqueue
[media] cx25821: Drop Freeing of Workqueue
[media] cxd2841er: force 8MHz bandwidth for DVB-C if specified bw not supported
[media] redrat3: hardware-specific parameters
[media] redrat3: remove hw_timeout member
[media] cxd2841er: BER and SNR reading for ISDB-T
[media] dvb-usb: avoid link error with dib3000m{b,c|
[media] dvb-usb: split out common parts of dibusb
...
* PMEM sub-division support: Allow a single PMEM region to be divided
into multiple namespaces. Originally, ~2 years ago, it was thought that
partitions of a /dev/pmemX block device could handle sub-allocations of
persistent memory for different use cases. With the decision to not
support DAX mappings of raw block-devices, and the genesis of
device-dax, the need for having multiple pmem-namespace per region has
grown.
* Device-DAX unified inode: In support of dynamic-resizing of a
device-dax instance the kernel arranges for all mappings of a
device-dax node to share the same inode. This allows unmap / truncate /
invalidation events to affect all instances of the device similar to the
behavior of mmap on block devices.
* Hardware error scrubbing reworks: The original address-range-scrub +
badblocks tracking solution allowed clearing entries at the individual
namespace level, but it failed to clear the internal list of media
errors maintained at the bus level. The result was that the next scrub
or namespace disable/re-enable event would restore the cleared
badblocks, but now that is fixed. The v4.8 kernel introduced an
auto-scrub-on-machine-check behavior to repopulate the badblocks list.
Now, in v4.9, the auto-scrub behavior can be disabled and simply arrange
for the error reported in the machine-check to be added to the list.
* DIMM health-event notification support: ACPI 6.1 defines a
notification event code that can be send to ACPI NVDIMM devices. A
poll(2) capable file descriptor for these events can be obtained from
the nmemX/nfit/flags sysfs-attribute of a libnvdimm memory device.
* Miscellaneous fixes: NVDIMM-N probe error, device-dax build error, and
a change to dedup the flush hint list to not flush the memory controller
more than necessary.
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJX/B2oAAoJEB7SkWpmfYgCe3YQAJiH4ZYRxr6HeJzVQltbhB2k
qyLC+7vIssefPPqn/Wycc3aHJjyk2ktetmFyjYE1q/vlJJWCG3y/ACfz2SZANXXx
2tgLsI+3dXZaGgIxRsZF8MsB672owqCbzJHbbmTRu3EtgMplagfh27G7HFZxt4Jd
FyKnRkknYsCEbHry/s0aRcZWPmacu5v1TDJyWgd0edNTG32GrKOtwxWrWEPRDJE1
dIK5JjPaDwMFMKjV6lgRuBVlsMKCzIC4YjSYZZmN/Mf/JCJBJuPSlkYEdGZ+xx84
/ZmKrE/XRPr7469f66QyD8iRtGAQ9OparhChbuzCagCHRAwgYy4yQGbK7rk0lwUM
18jysZU8NJxp4jEJIt0u2ap6W9ySePX5Bm+3CSwqxT0Ernew2AUJDLIw9f1hAAbX
rippSWyHp0JtBTjOeaV2ZY1LJlm+J//AycbFo51lAERHoX5zPimHL730EM8mJu7y
fIbFpau3fjob+ovQMXMIYam8C/MpTqAvcjpBFhkSlsY7q/l+ARgFpjYpg9qVir8g
v6PZ0UoGBhQvD2lTNTUjaCaHOc+sjo8PLeNI1ZsFebh63rF3k5sOLOk7wXllf8z5
jQBnYtYnPCJI67BLLZmwWzoBb0HpCbcPp9/0/c1rdLTcAo+3gi6SY4pVJgznxCZZ
+fkeOvSutJ687tFMarc1
=SenK
-----END PGP SIGNATURE-----
Merge tag 'libnvdimm-for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Dan Williams:
"Aside from the recently added pmem sub-division support these have
been in -next for several releases with no reported issues. The sub-
division support was included in next-20161010 with no reported
issues. It passes all unit tests including new tests for all the new
functionality below.
Summary:
- PMEM sub-division support: Allow a single PMEM region to be divided
into multiple namespaces. Originally, ~2 years ago, it was thought
that partitions of a /dev/pmemX block device could handle
sub-allocations of persistent memory for different use cases. With
the decision to not support DAX mappings of raw block-devices, and
the genesis of device-dax, the need for having multiple
pmem-namespace per region has grown.
- Device-DAX unified inode: In support of dynamic-resizing of a
device-dax instance the kernel arranges for all mappings of a
device-dax node to share the same inode. This allows unmap /
truncate / invalidation events to affect all instances of the
device similar to the behavior of mmap on block devices.
- Hardware error scrubbing reworks: The original address-range-scrub
and badblocks tracking solution allowed clearing entries at the
individual namespace level, but it failed to clear the internal
list of media errors maintained at the bus level. The result was
that the next scrub or namespace disable/re-enable event would
restore the cleared badblocks, but now that is fixed. The v4.8
kernel introduced an auto-scrub-on-machine-check behavior to
repopulate the badblocks list. Now, in v4.9, the auto-scrub
behavior can be disabled and simply arrange for the error reported
in the machine-check to be added to the list.
- DIMM health-event notification support: ACPI 6.1 defines a
notification event code that can be send to ACPI NVDIMM devices. A
poll(2) capable file descriptor for these events can be obtained
from the nmemX/nfit/flags sysfs-attribute of a libnvdimm memory
device.
- Miscellaneous fixes: NVDIMM-N probe error, device-dax build error,
and a change to dedup the flush hint list to not flush the memory
controller more than necessary"
* tag 'libnvdimm-for-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (39 commits)
/dev/dax: fix Kconfig dependency build breakage
dax: use correct dev_t value
dax: convert devm_create_dax_dev to PTR_ERR
libnvdimm, namespace: allow creation of multiple pmem-namespaces per region
libnvdimm, namespace: lift single pmem limit in scan_labels()
libnvdimm, namespace: filter out of range labels in scan_labels()
libnvdimm, namespace: enable allocation of multiple pmem namespaces
libnvdimm, namespace: update label implementation for multi-pmem
libnvdimm, namespace: expand pmem device naming scheme for multi-pmem
libnvdimm, region: update nd_region_available_dpa() for multi-pmem support
libnvdimm, namespace: sort namespaces by dpa at init
libnvdimm, namespace: allow multiple pmem-namespaces per region at scan time
tools/testing/nvdimm: support for sub-dividing a pmem region
libnvdimm, namespace: unify blk and pmem label scanning
libnvdimm, namespace: refactor uuid_show() into a namespace_to_uuid() helper
libnvdimm, label: convert label tracking to a linked list
libnvdimm, region: move region-mapping input-paramters to nd_mapping_desc
nvdimm: reduce duplicated wpq flushes
libnvdimm: clear the internal poison_list when clearing badblocks
pmem: reduce kmap_atomic sections to the memcpys only
...
Pull misc vfs updates from Al Viro:
"Assorted misc bits and pieces.
There are several single-topic branches left after this (rename2
series from Miklos, current_time series from Deepa Dinamani, xattr
series from Andreas, uaccess stuff from from me) and I'd prefer to
send those separately"
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (39 commits)
proc: switch auxv to use of __mem_open()
hpfs: support FIEMAP
cifs: get rid of unused arguments of CIFSSMBWrite()
posix_acl: uapi header split
posix_acl: xattr representation cleanups
fs/aio.c: eliminate redundant loads in put_aio_ring_file
fs/internal.h: add const to ns_dentry_operations declaration
compat: remove compat_printk()
fs/buffer.c: make __getblk_slow() static
proc: unsigned file descriptors
fs/file: more unsigned file descriptors
fs: compat: remove redundant check of nr_segs
cachefiles: Fix attempt to read i_blocks after deleting file [ver #2]
cifs: don't use memcpy() to copy struct iov_iter
get rid of separate multipage fault-in primitives
fs: Avoid premature clearing of capabilities
fs: Give dentry to inode_change_ok() instead of inode
fuse: Propagate dentry down to inode_change_ok()
ceph: Propagate dentry down to inode_change_ok()
xfs: Propagate dentry down to inode_change_ok()
...
Pull HID updates from Jiri Kosina:
- Integrated Sensor Hub support (Cherrytrail+) from Srinivas Pandruvada
- Big cleanup of Wacom driver; namely it's now using devres, and the
standardized LED API so that libinput doesn't need to have root
access any more, with substantial amount of other cleanups
piggy-backing on top. All this from Benjamin Tissoires
- Report descriptor parsing would now ignore and out-of-range System
controls in case of the application actually being System Control.
This fixes quite some issues with several devices, and allows us to
remove a few ->report_fixup callbacks. From Benjamin Tissoires
- ... a lot of other assorted small fixes and device ID additions
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (76 commits)
HID: add missing \n to end of dev_warn messages
HID: alps: fix multitouch cursor issue
HID: hid-logitech: Documentation updates/corrections
HID: hid-logitech: Improve Wingman Formula Force GP support
HID: hid-logitech: Rewrite of descriptor for all DF wheels
HID: hid-logitech: Compute combined pedals value
HID: hid-logitech: Add combined pedal support Logitech wheels
HID: hid-logitech: Introduce control for combined pedals feature
HID: sony: Update copyright and add Dualshock 4 rate control note
HID: sony: Defer the initial USB Sixaxis output report
HID: sony: Relax duplicate checking for USB-only devices
Revert "HID: microsoft: fix invalid rdesc for 3k kbd"
HID: alps: fix error return code in alps_input_configured()
HID: alps: fix stick device not working after resume
HID: support for keyboard - Corsair STRAFE
HID: alps: Fix memory leak
HID: uclogic: Add support for UC-Logic TWHA60 v3
HID: uclogic: Override constant descriptors
HID: uclogic: Support UGTizer GP0610 partially
HID: uclogic: Add support for several more tablets
...
Pull namespace updates from Eric Biederman:
"This set of changes is a number of smaller things that have been
overlooked in other development cycles focused on more fundamental
change. The devpts changes are small things that were a distraction
until we managed to kill off DEVPTS_MULTPLE_INSTANCES. There is an
trivial regression fix to autofs for the unprivileged mount changes
that went in last cycle. A pair of ioctls has been added by Andrey
Vagin making it is possible to discover the relationships between
namespaces when referring to them through file descriptors.
The big user visible change is starting to add simple resource limits
to catch programs that misbehave. With namespaces in general and user
namespaces in particular allowing users to use more kinds of
resources, it has become important to have something to limit errant
programs. Because the purpose of these limits is to catch errant
programs the code needs to be inexpensive to use as it always on, and
the default limits need to be high enough that well behaved programs
on well behaved systems don't encounter them.
To this end, after some review I have implemented per user per user
namespace limits, and use them to limit the number of namespaces. The
limits being per user mean that one user can not exhause the limits of
another user. The limits being per user namespace allow contexts where
the limit is 0 and security conscious folks can remove from their
threat anlysis the code used to manage namespaces (as they have
historically done as it root only). At the same time the limits being
per user namespace allow other parts of the system to use namespaces.
Namespaces are increasingly being used in application sand boxing
scenarios so an all or nothing disable for the entire system for the
security conscious folks makes increasing use of these sandboxes
impossible.
There is also added a limit on the maximum number of mounts present in
a single mount namespace. It is nontrivial to guess what a reasonable
system wide limit on the number of mount structure in the kernel would
be, especially as it various based on how a system is using
containers. A limit on the number of mounts in a mount namespace
however is much easier to understand and set. In most cases in
practice only about 1000 mounts are used. Given that some autofs
scenarious have the potential to be 30,000 to 50,000 mounts I have set
the default limit for the number of mounts at 100,000 which is well
above every known set of users but low enough that the mount hash
tables don't degrade unreaonsably.
These limits are a start. I expect this estabilishes a pattern that
other limits for resources that namespaces use will follow. There has
been interest in making inotify event limits per user per user
namespace as well as interest expressed in making details about what
is going on in the kernel more visible"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (28 commits)
autofs: Fix automounts by using current_real_cred()->uid
mnt: Add a per mount namespace limit on the number of mounts
netns: move {inc,dec}_net_namespaces into #ifdef
nsfs: Simplify __ns_get_path
tools/testing: add a test to check nsfs ioctl-s
nsfs: add ioctl to get a parent namespace
nsfs: add ioctl to get an owning user namespace for ns file descriptor
kernel: add a helper to get an owning user namespace for a namespace
devpts: Change the owner of /dev/pts/ptmx to the mounter of /dev/pts
devpts: Remove sync_filesystems
devpts: Make devpts_kill_sb safe if fsi is NULL
devpts: Simplify devpts_mount by using mount_nodev
devpts: Move the creation of /dev/pts/ptmx into fill_super
devpts: Move parse_mount_options into fill_super
userns: When the per user per user namespace limit is reached return ENOSPC
userns; Document per user per user namespace limits.
mntns: Add a limit on the number of mount namespaces.
netns: Add a limit on the number of net namespaces
cgroupns: Add a limit on the number of cgroup namespaces
ipcns: Add a limit on the number of ipc namespaces
...
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJX8Zc4AAoJEHm+PkMAQRiGQG8H/2Hd4IwJh75snGY5LAiWt6ra
kGM/SobvLAMtcoxXCeHqf2bZrxa2Zz9tnEzhuLMGaf9a3l79xHa8YumK5KS1JPGV
6lZBvuPi8BIyT0cpYH000e5ehHfhP6pSGJKZ2EuLv43HcBeVZEGAf3/8jSAlNA15
bwFy2ZEkwJGThbnT6au0Y3s9h8LcNjtllu9hjfb+/9oNGvp8r4QhdVodIqIQ4cv6
SeUfv7Pn2LZDMCSaTP9bh2KaR4dwYZHFsVe75x2wND5Sfq7DVBBfFkAoV/RwJDTM
gBc3PNnmzb/tix6ohOrSQnSiGsXv1uASxvHH3RD2zG6g7Aj9Eq/+Z7ZdPu2+o+U=
=U+ef
-----END PGP SIGNATURE-----
Merge tag 'v4.8' into patchwork
Linux 4.8
* tag 'v4.8': (1761 commits)
Linux 4.8
ARM: 8618/1: decompressor: reset ttbcr fields to use TTBR0 on ARMv7
MIPS: CM: Fix mips_cm_max_vp_width for non-MT kernels on MT systems
include/linux/property.h: fix typo/compile error
ocfs2: fix deadlock on mmapped page in ocfs2_write_begin_nolock()
mm: workingset: fix crash in shadow node shrinker caused by replace_page_cache_page()
MAINTAINERS: Switch to kernel.org email address for Javi Merino
x86/entry/64: Fix context tracking state warning when load_gs_index fails
x86/boot: Initialize FPU and X86_FEATURE_ALWAYS even if we don't have CPUID
x86/vdso: Fix building on big endian host
x86/boot: Fix another __read_cr4() case on 486
sctp: fix the issue sctp_diag uses lock_sock in rcu_read_lock
sctp: change to check peer prsctp_capable when using prsctp polices
sctp: remove prsctp_param from sctp_chunk
sctp: move sent_count to the memory hole in sctp_chunk
tg3: Avoid NULL pointer dereference in tg3_io_error_detected()
x86/init: Fix cr4_init_shadow() on CR4-less machines
MIPS: Fix detection of unsupported highmem with cache aliases
MIPS: Malta: Fix IOCU disable switch read for MIPS64
MIPS: Fix BUILD_ROLLBACK_PROLOGUE for microMIPS
...
Pull fuse updates from Miklos Szeredi:
"This adds POSIX ACL permission checking to the fuse kernel module.
In addition there are minor bug fixes as well as cleanups"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: limit xattr returned size
fuse: remove duplicate cs->offset assignment
fuse: don't use fuse_ioctl_copy_user() helper
fuse_ioctl_copy_user(): don't open-code copy_page_{to,from}_iter()
fuse: get rid of fc->flags
fuse: use timespec64
fuse: don't use ->d_time
fuse: Add posix ACL support
fuse: handle killpriv in userspace fs
fuse: fix killing s[ug]id in setattr
fuse: invalidate dir dentry after chmod
fuse: Use generic xattr ops
fuse: listxattr: verify xattr list
Pull networking updates from David Miller:
1) BBR TCP congestion control, from Neal Cardwell, Yuchung Cheng and
co. at Google. https://lwn.net/Articles/701165/
2) Do TCP Small Queues for retransmits, from Eric Dumazet.
3) Support collect_md mode for all IPV4 and IPV6 tunnels, from Alexei
Starovoitov.
4) Allow cls_flower to classify packets in ip tunnels, from Amir Vadai.
5) Support DSA tagging in older mv88e6xxx switches, from Andrew Lunn.
6) Support GMAC protocol in iwlwifi mwm, from Ayala Beker.
7) Support ndo_poll_controller in mlx5, from Calvin Owens.
8) Move VRF processing to an output hook and allow l3mdev to be
loopback, from David Ahern.
9) Support SOCK_DESTROY for UDP sockets. Also from David Ahern.
10) Congestion control in RXRPC, from David Howells.
11) Support geneve RX offload in ixgbe, from Emil Tantilov.
12) When hitting pressure for new incoming TCP data SKBs, perform a
partial rathern than a full purge of the OFO queue (which could be
huge). From Eric Dumazet.
13) Convert XFRM state and policy lookups to RCU, from Florian Westphal.
14) Support RX network flow classification to igb, from Gangfeng Huang.
15) Hardware offloading of eBPF in nfp driver, from Jakub Kicinski.
16) New skbmod packet action, from Jamal Hadi Salim.
17) Remove some inefficiencies in snmp proc output, from Jia He.
18) Add FIB notifications to properly propagate route changes to
hardware which is doing forwarding offloading. From Jiri Pirko.
19) New dsa driver for qca8xxx chips, from John Crispin.
20) Implement RFC7559 ipv6 router solicitation backoff, from Maciej
Żenczykowski.
21) Add L3 mode to ipvlan, from Mahesh Bandewar.
22) Support 802.1ad in mlx4, from Moshe Shemesh.
23) Support hardware LRO in mediatek driver, from Nelson Chang.
24) Add TC offloading to mlx5, from Or Gerlitz.
25) Convert various drivers to ethtool ksettings interfaces, from
Philippe Reynes.
26) TX max rate limiting for cxgb4, from Rahul Lakkireddy.
27) NAPI support for ath10k, from Rajkumar Manoharan.
28) Support XDP in mlx5, from Rana Shahout and Saeed Mahameed.
29) UDP replicast support in TIPC, from Richard Alpe.
30) Per-queue statistics for qed driver, from Sudarsana Reddy Kalluru.
31) Support BQL in thunderx driver, from Sunil Goutham.
32) TSO support in alx driver, from Tobias Regnery.
33) Add stream parser engine and use it in kcm.
34) Support async DHCP replies in ipconfig module, from Uwe
Kleine-König.
35) DSA port fast aging for mv88e6xxx driver, from Vivien Didelot.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1715 commits)
mlxsw: switchx2: Fix misuse of hard_header_len
mlxsw: spectrum: Fix misuse of hard_header_len
net/faraday: Stop NCSI device on shutdown
net/ncsi: Introduce ncsi_stop_dev()
net/ncsi: Rework the channel monitoring
net/ncsi: Allow to extend NCSI request properties
net/ncsi: Rework request index allocation
net/ncsi: Don't probe on the reserved channel ID (0x1f)
net/ncsi: Introduce NCSI_RESERVED_CHANNEL
net/ncsi: Avoid unused-value build warning from ia64-linux-gcc
net: Add netdev all_adj_list refcnt propagation to fix panic
net: phy: Add Edge-rate driver for Microsemi PHYs.
vmxnet3: Wake queue from reset work
i40e: avoid NULL pointer dereference and recursive errors on early PCI error
qed: Add RoCE ll2 & GSI support
qed: Add support for memory registeration verbs
qed: Add support for QP verbs
qed: PD,PKEY and CQ verb support
qed: Add support for RoCE hw init
qede: Add qedr framework
...
Pull audit updates from Paul Moore:
"Another relatively small pull request for v4.9 with just two patches.
The patch from Richard updates the list of features we support and
report back to userspace; this should have been sent earlier with the
rest of the v4.8 patches but it got lost in my inbox.
The second patch fixes a problem reported by our Android friends where
we weren't very consistent in recording PIDs"
* 'stable-4.9' of git://git.infradead.org/users/pcmoore/audit:
audit: add exclude filter extension to feature bitmap
audit: consistently record PIDs with task_tgid_nr()
Resolve the merge conflict between Felix's/my and Toke's patches
coming into the tree through net and mac80211-next respectively.
Most of Felix's changes go away due to Toke's new infrastructure
work, my patch changes to "goto begin" (the label wasn't there
before) instead of returning NULL so flow control towards drivers
is preserved better.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
As part of the sync framework destaging, the sync_file.h header
was moved, but an entry was not added on Kbuild to install it.
This patch resolves this omission so that "make headers_install"
installs this header.
Fixes: 460bfc41fd ("dma-buf/sync_file: de-stage sync_file headers")
Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Emilio López <emilio.lopez@collabora.co.uk>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20160927143142.8975-1-emilio.lopez@collabora.co.uk
Here is the big USB, and PHY, and extcon, patchsets for 4.9-rc1.
Full details are in the shortlog, but generally a lot of new hardware
support, usb gadget updates, and Wolfram's great cleanup of USB error
message handling, making the kernel image a tad bit smaller.
All of this has been in linux-next with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iFYEABECABYFAlfyNTEPHGdyZWdAa3JvYWguY29tAAoJEDFH1A3bLfspbuUAoJAn
XD6k9A+0QgnJ/iLiT8pztawNAKCCVYZOzgdFRGsnaZ2p7lb9IHpPCA==
=QUj+
-----END PGP SIGNATURE-----
Merge tag 'usb-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull usb/phy/extcon updates from Greg KH:
"Here is the big USB, and PHY, and extcon, patchsets for 4.9-rc1.
Full details are in the shortlog, but generally a lot of new hardware
support, usb gadget updates, and Wolfram's great cleanup of USB error
message handling, making the kernel image a tad bit smaller.
All of this has been in linux-next with no reported issues"
* tag 'usb-4.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (343 commits)
Revert "usbtmc: convert to devm_kzalloc"
USB: serial: cp210x: Add ID for a Juniper console
usb: Kconfig: using select for USB_COMMON dependency
bluetooth: bcm203x: don't print error when allocating urb fails
mmc: host: vub300: don't print error when allocating urb fails
usb: hub: change CLEAR_FEATURE to SET_FEATURE
usb: core: Introduce a USB port LED trigger
USB: bcma: drop Northstar PHY 2.0 initialization code
usb: core: hcd: add missing header dependencies
usb: musb: da8xx: fix error handling message in probe
usb: musb: Fix session based PM for first invalid VBUS
usb: musb: Fix PM runtime for disconnect after unconfigure
musb: Export musb_root_disconnect for use in modules
usb: misc: legousbtower: Fix NULL pointer deference
cdc-acm: hardening against malicious devices
Revert "usb: gadget: NCM: Protect dev->port_usb using dev->lock"
include: extcon: Fix compilation error caused because of incomplete merge
MAINTAINERS: add tree entry for USB Serial
phy-twl4030-usb: initialize charging-related stuff via pm_runtime
phy-twl4030-usb: better handle musb_mailbox() failure
...
There are two separate issues that can lead to corrupted free space
trees.
1. The free space tree bitmaps had an endianness issue on big-endian
systems which is fixed by an earlier patch in this series.
2. btrfs-progs before v4.7.3 modified filesystems without updating the
free space tree.
To catch both of these issues at once, we need to force the free space
tree to be rebuilt. To do so, add a FREE_SPACE_TREE_VALID compat_ro bit.
If the bit isn't set, we know that it was either produced by a broken
big-endian kernel or may have been corrupted by btrfs-progs.
This also provides us with a way to add rudimentary read-write support
for the free space tree to btrfs-progs: it can just clear this bit and
have the kernel rebuild the free space tree.
Cc: stable@vger.kernel.org # 4.5+
Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Tested-by: Chandan Rajendra <chandan@linux.vnet.ibm.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add a new fallocate mode flag that explicitly unshares blocks on
filesystems that support such features. The new flag can only
be used with an allocate-mode fallocate call.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Introduce XFLAGs for the new XFS CoW extent size hint, and actually
plumb the CoW extent size hint into the fsxattr structure.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Quadrature encoders, such as rotary encoders and linear encoders, are
devices which are capable of encoding the relative position and
direction of motion of a shaft. This patch introduces several IIO
constants for supporting quadrature encoder counter devices.
IIO_COUNT: Current count (main data provided by the counter device)
IIO_INDEX: Counter device index value
Signed-off-by: William Breathitt Gray <vilhelm.gray@gmail.com>
Signed-off-by: Jonathan Cameron <jic23@kernel.org>
Add a new INIT flag, FUSE_POSIX_ACL, for negotiating ACL support with
userspace. When it is set in the INIT response, ACL support will be
enabled. ACL support also implies "default_permissions".
When ACL support is enabled, the kernel will cache and have responsibility
for enforcing ACLs. ACL xattrs will be passed to userspace, which is
responsible for updating the ACLs in the filesystem, keeping the file mode
in sync, and inheritance of default ACLs when new filesystem nodes are
created.
Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Only userspace filesystem can do the killing of suid/sgid without races.
So introduce an INIT flag and negotiate support for this.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Provide a function that reports NAN DE function termination. The function
may be terminated due to one of the following reasons: user request,
ttl expiration or failure.
If the NAN instance is tied to the owner, the notification will be
sent to the socket that started the NAN interface only
Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Provide a function the driver can call to report a match.
This will send the event to the user space.
If the NAN instance is tied to the owner, the notifications will be
sent to the socket that started the NAN interface only.
Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Some NAN configuration paramaters may change during the operation of
the NAN device. For example, a user may want to update master preference
value when the device gets plugged/unplugged to the power.
Add API that allows to do so.
Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
A NAN function can be either publish, subscribe or follow
up. Make all the necessary verifications and just pass the
request to the driver.
Allow the user space application that starts NAN to
forbid any other socket to add or remove functions.
Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Ayala Beker <ayala.beker@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This allows user space to start/stop NAN interface.
A NAN interface is like P2P device in a few aspects: it
doesn't have a netdev associated to it.
Add the new interface type and prevent operations that
can't be executed on NAN interface like scan.
Define several attributes that may be configured by user space
when starting NAN functionality (master preference and dual
band operation)
Signed-off-by: Andrei Otcheretianski <andrei.otcheretianski@intel.com>
Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
This implements:
https://tools.ietf.org/html/rfc7559
Backoff is performed according to RFC3315 section 14:
https://tools.ietf.org/html/rfc3315#section-14
We allow setting /proc/sys/net/ipv6/conf/*/router_solicitations
to a negative value meaning an unlimited number of retransmits,
and we make this the new default (inline with the RFC).
We also add a new setting:
/proc/sys/net/ipv6/conf/*/router_solicitation_max_interval
defaulting to 1 hour (per RFC recommendation).
Signed-off-by: Maciej Żenczykowski <maze@google.com>
Acked-by: Erik Kline <ek@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a simple device driver to expose the iBT interface on
Aspeed SOCs (AST2400 and AST2500) as a character device. Such SOCs are
commonly used as BMCs (BaseBoard Management Controllers) and this
driver implements the BMC side of the BT interface.
The BT (Block Transfer) interface is used to perform in-band IPMI
communication between a host and its BMC. Entire messages are buffered
before sending a notification to the other end, host or BMC, that
there is data to be read. Usually, the host emits requests and the BMC
responses but the specification provides a mean for the BMC to send
SMS Attention (BMC-to-Host attention or System Management Software
attention) messages.
For this purpose, the driver introduces a specific ioctl on the
device: 'BT_BMC_IOCTL_SMS_ATN' that can be used by the system running
on the BMC to signal the host of such an event.
The device name defaults to '/dev/ipmi-bt-host'
Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Joel Stanley <joel@jms.id.au>
[clg: - checkpatch fixes
- added a devicetree binding documentation
- replace 'bt_host' by 'bt_bmc' to reflect that the driver is
the BMC side of the IPMI BT interface
- renamed the device to 'ipmi-bt-host'
- introduced a temporary buffer to copy_{to,from}_user
- used platform_get_irq()
- moved the driver under drivers/char/ipmi/ but kept it as a misc
device
- changed the compatible cell to "aspeed,ast2400-bt-bmc"
]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
[clg: - checkpatch --strict fixes
- removed the use of devm_iounmap, devm_kfree in cleanup paths
- introduced an atomic-t to limit opens to 1
- introduced a mutex to protect write/read operations]
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Add to the audit feature bitmap to indicate availability of the
extension of the exclude filter to include PID, UID, AUID, GID, SUBJ_*.
RFE: add additional fields for use in audit filter exclude rules
https://github.com/linux-audit/audit-kernel/issues/5
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJX6H4uAAoJEHm+PkMAQRiG5sMH/3yzrMiUCSokdS+cvY+jgKAG
JS58JmRvBPz2mRaU3MRPBGRDeCz/Nc9LggL2ZcgM+E1ZYirlYyQfIED3lkqk5R07
kIN1wmb+kQhXyU4IY3fEX7joqyKC6zOy4DUChPkBQU0/0+VUmdVmcJvsuPlnMZtf
g95m0BdYTui+eDezASRqOEp3Lb5ONL4c3ao4yBP0LHF033ctj3VJQiyi5uERPZJ0
5e6Mo7Wxn78t9WqJLQAiEH46kTwT2plNlxf3XXqTenfIdbWhqE873HPGeSMa3VQV
VywXTpCpSPQsA8BYg66qIbebdKOhs9MOviHVfqDtwQlvwhjlBDya0gNHfI5fSy4=
=Y/L5
-----END PGP SIGNATURE-----
Merge tag 'v4.8-rc8' into drm-next
Linux 4.8-rc8
There was a lot of fallout in the imx/amdgpu/i915 drivers, so backmerge
it now to avoid troubles.
* tag 'v4.8-rc8': (1442 commits)
Linux 4.8-rc8
fault_in_multipages_readable() throws set-but-unused error
mm: check VMA flags to avoid invalid PROT_NONE NUMA balancing
radix tree: fix sibling entry handling in radix_tree_descend()
radix tree test suite: Test radix_tree_replace_slot() for multiorder entries
fix memory leaks in tracing_buffers_splice_read()
tracing: Move mutex to protect against resetting of seq data
MIPS: Fix delay slot emulation count in debugfs
MIPS: SMP: Fix possibility of deadlock when bringing CPUs online
mm: delete unnecessary and unsafe init_tlb_ubc()
huge tmpfs: fix Committed_AS leak
shmem: fix tmpfs to handle the huge= option properly
blk-mq: skip unmapped queues in blk_mq_alloc_request_hctx
MIPS: Fix pre-r6 emulation FPU initialisation
arm64: kgdb: handle read-only text / modules
arm64: Call numa_store_cpu_info() earlier.
locking/hung_task: Fix typo in CONFIG_DETECT_HUNG_TASK help text
nvme-rdma: only clear queue flags after successful connect
i2c: qup: skip qup_i2c_suspend if the device is already runtime suspended
perf/core: Limit matching exclusive events to one PMU
...
Export the base definitions and the xattr representation of POSIX ACLs
to user space.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The Altera 16550 soft IP UART requires 2 additional registers for
TX FIFO threshold support. These 2 registers enable the TX FIFO
Low Watermark and set the TX FIFO Low Watermark.
Set the TX FIFO threshold to the FIFO size - tx_loadsz.
Signed-off-by: Thor Thayer <tthayer@opensource.altera.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
The previous commit added support for specifying the beacon rate
for AP mode. Add features checks to this, and extend it to also
support the rate configuration for mesh networks. For IBSS it's
not as simple due to joining etc., so that's not yet supported.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Conflicts:
net/netfilter/core.c
net/netfilter/nf_tables_netdev.c
Resolve two conflicts before pull request for David's net-next tree:
1) Between c73c248490 ("netfilter: nf_tables_netdev: remove redundant
ip_hdr assignment") from the net tree and commit ddc8b6027a
("netfilter: introduce nft_set_pktinfo_{ipv4, ipv6}_validate()").
2) Between e8bffe0cf9 ("net: Add _nf_(un)register_hooks symbols") and
Aaron Conole's patches to replace list_head with single linked list.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
NFTA_LOG_FLAGS attribute is already supported, but the related
NF_LOG_XXX flags are not exposed to the userspace. So we cannot
explicitly enable log flags to log uid, tcp sequence, ip options
and so on, i.e. such rule "nft add rule filter output log uid"
is not supported yet.
So move NF_LOG_XXX macro definitions to the uapi/../nf_log.h. In
order to keep consistent with other modules, change NF_LOG_MASK to
refer to all supported log flags. On the other hand, add a new
NF_LOG_DEFAULT_MASK to refer to the original default log flags.
Finally, if user specify the unsupported log flags or NFTA_LOG_GROUP
and NFTA_LOG_FLAGS are set at the same time, report EINVAL to the
userspace.
Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Inverse ranges != [a,b] are not currently possible because rules are
composites of && operations, and we need to express this:
data < a || data > b
This patch adds a new range expression. Positive ranges can be already
through two cmp expressions:
cmp(sreg, data, >=)
cmp(sreg, data, <=)
This new range expression provides an alternative way to express this.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Create a new revision for the hashlimit iptables extension module. Rev 2
will support higher pps of upto 1 million, Version 1 supports only 10k.
To support this we have to increase the size of the variables avg and
burst in hashlimit_cfg to 64-bit. Create two new structs hashlimit_cfg2
and xt_hashlimit_mtinfo2 and also create newer versions of all the
functions for match, checkentry and destroy.
Some of the functions like hashlimit_mt, hashlimit_mt_check etc are very
similar in both rev1 and rev2 with only minor changes, so I have split
those functions and moved all the common code to a *_common function.
Signed-off-by: Vishwanath Pai <vpai@akamai.com>
Signed-off-by: Joshua Hunt <johunt@akamai.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Steffen Klassert says:
====================
pull request (net-next): ipsec-next 2016-09-23
Only two patches this time:
1) Fix a comment reference to struct xfrm_replay_state_esn.
From Richard Guy Briggs.
2) Convert xfrm_state_lookup to rcu, we don't need the
xfrm_state_lock anymore in the input path.
From Florian Westphal.
Please pull or let me know if there are problems.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce new rtnl UAPI that exposes a list of vlans per VF, giving
the ability for user-space application to specify it for the VF, as an
option to support 802.1ad.
We adjusted IP Link tool to support this option.
For future use cases, the new UAPI supports multiple vlans. For now we
limit the list size to a single vlan in kernel.
Add IFLA_VF_VLAN_LIST in addition to IFLA_VF_VLAN to keep backward
compatibility with older versions of IP Link tool.
Add a vlan protocol parameter to the ndo_set_vf_vlan callback.
We kept 802.1Q as the drivers' default vlan protocol.
Suitable ip link tool command examples:
Set vf vlan protocol 802.1ad:
ip link set eth0 vf 1 vlan 100 proto 802.1ad
Set vf to VST (802.1Q) mode:
ip link set eth0 vf 1 vlan 100 proto 802.1Q
Or by omitting the new parameter
ip link set eth0 vf 1 vlan 100
Signed-off-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a small helper that complements 36bbef52c7 ("bpf: direct packet
write and access for helpers for clsact progs") for invalidating the
current skb->hash after mangling on headers via direct packet write.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
It looks like the following patch can make FQ very precise, even in VM
or stressed hosts. It matters at high pacing rates.
We take into account the difference between the time that was programmed
when last packet was sent, and current time (a drift of tens of usecs is
often observed)
Add an EWMA of the unthrottle latency to help diagnostics.
This latency is the difference between current time and oldest packet in
delayed RB-tree. This accounts for the high resolution timer latency,
but can be different under stress, as fq_check_throttled() can be
opportunistically be called from a dequeue() called after an enqueue()
for a different flow.
Tested:
// Start a 10Gbit flow
$ netperf --google-pacing-rate 1250000000 -H lpaa24 -l 10000 -- -K bbr &
Before patch :
$ sar -n DEV 10 5 | grep eth0 | grep Average
Average: eth0 17106.04 756876.84 1102.75 1119049.02 0.00 0.00 0.52
After patch :
$ sar -n DEV 10 5 | grep eth0 | grep Average
Average: eth0 17867.00 800245.90 1151.77 1183172.12 0.00 0.00 0.52
A new iproute2 tc can output the 'unthrottle latency' :
$ tc -s qd sh dev eth0 | grep latency
0 gc, 0 highprio, 32490767 throttled, 2382 ns latency
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, the user can specify the queue numbers by _QUEUE_NUM and
_QUEUE_TOTAL attributes, this is enough in most situations.
But acctually, it is not very flexible, for example:
tcp dport 80 mapped to queue0
tcp dport 81 mapped to queue1
tcp dport 82 mapped to queue2
In order to do this thing, we must add 3 nft rules, and more
mapping meant more rules ...
So take one register to select the queue number, then we can add one
simple rule to mapping queues, maybe like this:
queue num tcp dport map { 80:0, 81:1, 82:2 ... }
Florian Westphal also proposed wider usage scenarios:
queue num jhash ip saddr . ip daddr mod ...
queue num meta cpu ...
queue num meta mark ...
The last point is how to load a queue number from sreg, although we can
use *(u16*)®s->data[reg] to load the queue number, just like nat expr
to load its l4port do.
But we will cooperate with hash expr, meta cpu, meta mark expr and so on.
They all store the result to u32 type, so cast it to u16 pointer and
dereference it will generate wrong result in the big endian system.
So just keep it simple, we treat queue number as u32 type, although u16
type is already enough.
Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Liping Zhang <liping.zhang@spreadtrum.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pid and user namepaces are hierarchical. There is no way to discover
parent-child relationships.
In a future we will use this interface to dump and restore nested
namespaces.
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Each namespace has an owning user namespace and now there is not way
to discover these relationships.
Understending namespaces relationships allows to answer the question:
what capability does process X have to perform operations on a resource
governed by namespace Y?
After a long discussion, Eric W. Biederman proposed to use ioctl-s for
this purpose.
The NS_GET_USERNS ioctl returns a file descriptor to an owning user
namespace.
It returns EPERM if a target namespace is outside of a current user
namespace.
v2: rename parent to relative
v3: Add a missing mntput when returning -EAGAIN --EWB
Acked-by: Serge Hallyn <serge@hallyn.com>
Link: https://lkml.org/lkml/2016/7/6/158
Signed-off-by: Andrei Vagin <avagin@openvz.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Add support of an offset value for incremental counter and random. With
this option the sysadmin is able to start the counter to a certain value
and then apply the generated number.
Example:
meta mark set numgen inc mod 2 offset 100
This will generate marks with the serie 100, 101, 100, 101, ...
Suggested-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Laura Garcia Liebana <nevola@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
TCA_VLAN_ACT_MODIFY allows one to change an existing tag.
It accepts same attributes as TCA_VLAN_ACT_PUSH (protocol, id,
priority).
If packet is vlan tagged, then the tag gets overwritten according to
user specified attributes.
For example, this allows user to replace a tag's vid while preserving
its priority bits (as opposed to "action vlan pop pipe action vlan push").
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add cls_bpf support for the TCA_CLS_FLAGS_SKIP_HW flag.
Unlike U32 and flower cls_bpf already has some netlink
flags defined. Create a new attribute to be able to use
the same flag values as the above.
Unlike U32 and flower reject unknown flags.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit implements a new TCP congestion control algorithm: BBR
(Bottleneck Bandwidth and RTT). A detailed description of BBR will be
published in ACM Queue, Vol. 14 No. 5, September-October 2016, as
"BBR: Congestion-Based Congestion Control".
BBR has significantly increased throughput and reduced latency for
connections on Google's internal backbone networks and google.com and
YouTube Web servers.
BBR requires only changes on the sender side, not in the network or
the receiver side. Thus it can be incrementally deployed on today's
Internet, or in datacenters.
The Internet has predominantly used loss-based congestion control
(largely Reno or CUBIC) since the 1980s, relying on packet loss as the
signal to slow down. While this worked well for many years, loss-based
congestion control is unfortunately out-dated in today's networks. On
today's Internet, loss-based congestion control causes the infamous
bufferbloat problem, often causing seconds of needless queuing delay,
since it fills the bloated buffers in many last-mile links. On today's
high-speed long-haul links using commodity switches with shallow
buffers, loss-based congestion control has abysmal throughput because
it over-reacts to losses caused by transient traffic bursts.
In 1981 Kleinrock and Gale showed that the optimal operating point for
a network maximizes delivered bandwidth while minimizing delay and
loss, not only for single connections but for the network as a
whole. Finding that optimal operating point has been elusive, since
any single network measurement is ambiguous: network measurements are
the result of both bandwidth and propagation delay, and those two
cannot be measured simultaneously.
While it is impossible to disambiguate any single bandwidth or RTT
measurement, a connection's behavior over time tells a clearer
story. BBR uses a measurement strategy designed to resolve this
ambiguity. It combines these measurements with a robust servo loop
using recent control systems advances to implement a distributed
congestion control algorithm that reacts to actual congestion, not
packet loss or transient queue delay, and is designed to converge with
high probability to a point near the optimal operating point.
In a nutshell, BBR creates an explicit model of the network pipe by
sequentially probing the bottleneck bandwidth and RTT. On the arrival
of each ACK, BBR derives the current delivery rate of the last round
trip, and feeds it through a windowed max-filter to estimate the
bottleneck bandwidth. Conversely it uses a windowed min-filter to
estimate the round trip propagation delay. The max-filtered bandwidth
and min-filtered RTT estimates form BBR's model of the network pipe.
Using its model, BBR sets control parameters to govern sending
behavior. The primary control is the pacing rate: BBR applies a gain
multiplier to transmit faster or slower than the observed bottleneck
bandwidth. The conventional congestion window (cwnd) is now the
secondary control; the cwnd is set to a small multiple of the
estimated BDP (bandwidth-delay product) in order to allow full
utilization and bandwidth probing while bounding the potential amount
of queue at the bottleneck.
When a BBR connection starts, it enters STARTUP mode and applies a
high gain to perform an exponential search to quickly probe the
bottleneck bandwidth (doubling its sending rate each round trip, like
slow start). However, instead of continuing until it fills up the
buffer (i.e. a loss), or until delay or ACK spacing reaches some
threshold (like Hystart), it uses its model of the pipe to estimate
when that pipe is full: it estimates the pipe is full when it notices
the estimated bandwidth has stopped growing. At that point it exits
STARTUP and enters DRAIN mode, where it reduces its pacing rate to
drain the queue it estimates it has created.
Then BBR enters steady state. In steady state, PROBE_BW mode cycles
between first pacing faster to probe for more bandwidth, then pacing
slower to drain any queue that created if no more bandwidth was
available, and then cruising at the estimated bandwidth to utilize the
pipe without creating excess queue. Occasionally, on an as-needed
basis, it sends significantly slower to probe for RTT (PROBE_RTT
mode).
BBR has been fully deployed on Google's wide-area backbone networks
and we're experimenting with BBR on Google.com and YouTube on a global
scale. Replacing CUBIC with BBR has resulted in significant
improvements in network latency and application (RPC, browser, and
video) metrics. For more details please refer to our upcoming ACM
Queue publication.
Example performance results, to illustrate the difference between BBR
and CUBIC:
Resilience to random loss (e.g. from shallow buffers):
Consider a netperf TCP_STREAM test lasting 30 secs on an emulated
path with a 10Gbps bottleneck, 100ms RTT, and 1% packet loss
rate. CUBIC gets 3.27 Mbps, and BBR gets 9150 Mbps (2798x higher).
Low latency with the bloated buffers common in today's last-mile links:
Consider a netperf TCP_STREAM test lasting 120 secs on an emulated
path with a 10Mbps bottleneck, 40ms RTT, and 1000-packet bottleneck
buffer. Both fully utilize the bottleneck bandwidth, but BBR
achieves this with a median RTT 25x lower (43 ms instead of 1.09
secs).
Our long-term goal is to improve the congestion control algorithms
used on the Internet. We are hopeful that BBR can help advance the
efforts toward this goal, and motivate the community to do further
research.
Test results, performance evaluations, feedback, and BBR-related
discussions are very welcome in the public e-mail list for BBR:
https://groups.google.com/forum/#!forum/bbr-dev
NOTE: BBR *must* be used with the fq qdisc ("man tc-fq") with pacing
enabled, since pacing is integral to the BBR design and
implementation. BBR without pacing would not function properly, and
may incur unnecessary high packet loss rates.
Signed-off-by: Van Jacobson <vanj@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit export two new fields in struct tcp_info:
tcpi_delivery_rate: The most recent goodput, as measured by
tcp_rate_gen(). If the socket is limited by the sending
application (e.g., no data to send), it reports the highest
measurement instead of the most recent. The unit is bytes per
second (like other rate fields in tcp_info).
tcpi_delivery_rate_app_limited: A boolean indicating if the goodput
was measured when the socket's throughput was limited by the
sending application.
This delivery rate information can be useful for applications that
want to know the current throughput the TCP connection is seeing,
e.g. adaptive bitrate video streaming. It can also be very useful for
debugging or troubleshooting.
Signed-off-by: Van Jacobson <vanj@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This commit adds to the fq module a low_rate_threshold parameter to
insert a delay after all packets if the socket requests a pacing rate
below the threshold.
This helps achieve more precise control of the sending rate with
low-rate paths, especially policers. The basic issue is that if a
congestion control module detects a policer at a certain rate, it may
want fq to be able to shape to that policed rate. That way the sender
can avoid policer drops by having the packets arrive at the policer at
or just under the policed rate.
The default threshold of 550Kbps was chosen analytically so that for
policers or links at 500Kbps or 512Kbps fq would very likely invoke
this mechanism, even if the pacing rate was briefly slightly above the
available bandwidth. This value was then empirically validated with
two years of production testing on YouTube video servers.
Signed-off-by: Van Jacobson <vanj@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This work implements direct packet access for helpers and direct packet
write in a similar fashion as already available for XDP types via commits
4acf6c0b84 ("bpf: enable direct packet data write for xdp progs") and
6841de8b0d ("bpf: allow helpers access the packet directly"), and as a
complementary feature to the already available direct packet read for tc
(cls/act) programs.
For enabling this, we need to introduce two helpers, bpf_skb_pull_data()
and bpf_csum_update(). The first is generally needed for both, read and
write, because they would otherwise only be limited to the current linear
skb head. Usually, when the data_end test fails, programs just bail out,
or, in the direct read case, use bpf_skb_load_bytes() as an alternative
to overcome this limitation. If such data sits in non-linear parts, we
can just pull them in once with the new helper, retest and eventually
access them.
At the same time, this also makes sure the skb is uncloned, which is, of
course, a necessary condition for direct write. As this needs to be an
invariant for the write part only, the verifier detects writes and adds
a prologue that is calling bpf_skb_pull_data() to effectively unclone the
skb from the very beginning in case it is indeed cloned. The heuristic
makes use of a similar trick that was done in 233577a220 ("net: filter:
constify detection of pkt_type_offset"). This comes at zero cost for other
programs that do not use the direct write feature. Should a program use
this feature only sparsely and has read access for the most parts with,
for example, drop return codes, then such write action can be delegated
to a tail called program for mitigating this cost of potential uncloning
to a late point in time where it would have been paid similarly with the
bpf_skb_store_bytes() as well. Advantage of direct write is that the
writes are inlined whereas the helper cannot make any length assumptions
and thus needs to generate a call to memcpy() also for small sizes, as well
as cost of helper call itself with sanity checks are avoided. Plus, when
direct read is already used, we don't need to cache or perform rechecks
on the data boundaries (due to verifier invalidating previous checks for
helpers that change skb->data), so more complex programs using rewrites
can benefit from switching to direct read plus write.
For direct packet access to helpers, we save the otherwise needed copy into
a temp struct sitting on stack memory when use-case allows. Both facilities
are enabled via may_access_direct_pkt_data() in verifier. For now, we limit
this to map helpers and csum_diff, and can successively enable other helpers
where we find it makes sense. Helpers that definitely cannot be allowed for
this are those part of bpf_helper_changes_skb_data() since they can change
underlying data, and those that write into memory as this could happen for
packet typed args when still cloned. bpf_csum_update() helper accommodates
for the fact that we need to fixup checksum_complete when using direct write
instead of bpf_skb_store_bytes(), meaning the programs can use available
helpers like bpf_csum_diff(), and implement csum_add(), csum_sub(),
csum_block_add(), csum_block_sub() equivalents in eBPF together with the
new helper. A usage example will be provided for iproute2's examples/bpf/
directory.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The ioctl name and description on the documentation block don't
match the ioctl being defined. This was probably overlooked while
renaming the ioctls during the sync file destaging. This patch
provides a more accurate description of what the ioctl actually does.
Signed-off-by: Emilio López <emilio.lopez@collabora.co.uk>
Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20160919042120.6280-1-emilio.lopez@collabora.co.uk
Sample use case of how this is encoded:
user space via tuntap (or a connected VM/Machine/container)
encodes the tcindex TLV.
Sample use case of decoding:
IFE action decodes it and the skb->tc_index is then used to classify.
So something like this for encoded ICMP packets:
.. first decode then reclassify... skb->tcindex will be set
sudo $TC filter add dev $ETH parent ffff: prio 2 protocol 0xbeef \
u32 match u32 0 0 flowid 1:1 \
action ife decode reclassify
...next match the decode icmp packet...
sudo $TC filter add dev $ETH parent ffff: prio 4 protocol ip \
u32 match ip protocol 1 0xff flowid 1:1 \
action continue
... last classify it using the tcindex classifier and do someaction..
sudo $TC filter add dev $ETH parent ffff: prio 5 protocol ip \
handle 0x11 tcindex classid 1:1 \
action blah..
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adding others generic flags, which could be used by many
components like GS1662.
Signed-off-by: Charles-Antoine Couret <charles-antoine.couret@nexvision.fr>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
In a typical IPvlan L3 setup where master is in default-ns and
each slave is into different (slave) ns. In this setup egress
packet processing for traffic originating from slave-ns will
hit all NF_HOOKs in slave-ns as well as default-ns. However same
is not true for ingress processing. All these NF_HOOKs are
hit only in the slave-ns skipping them in the default-ns.
IPvlan in L3 mode is restrictive and if admins want to deploy
iptables rules in default-ns, this asymmetric data path makes it
impossible to do so.
This patch makes use of the l3_rcv() (added as part of l3mdev
enhancements) to perform input route lookup on RX packets without
changing the skb->dev and then uses nf_hook at NF_INET_LOCAL_IN
to change the skb->dev just before handing over skb to L4.
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
CC: David Ahern <dsa@cumulusnetworks.com>
Reviewed-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a nested attribute of offload stats to if_stats_msg
named IFLA_STATS_LINK_OFFLOAD_XSTATS.
Under it, add SW stats, meaning stats only per packets that went via
slowpath to the cpu, named IFLA_OFFLOAD_XSTATS_CPU_HIT.
Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Similar to gre, vxlan, geneve tunnels allow IPIP tunnels to
operate in 'collect metadata' mode.
bpf_skb_[gs]et_tunnel_key() helpers can make use of it right away.
ovs can use it as well in the future (once appropriate ovs-vport
abstractions and user apis are added).
Note that just like in other tunnels we cannot cache the dst,
since tunnel_info metadata can be different for every packet.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch allows flock, posix locks, ofd locks and leases to work
correctly on overlayfs.
Instead of using the underlying inode for storing lock context use the
overlay inode. This allows locks to be persistent across copy-up.
This is done by introducing locks_inode() helper and using it instead of
file_inode() to get the inode in locking code. For non-overlayfs the two
are equivalent, except for an extra pointer dereference in locks_inode().
Since lock operations are in "struct file_operations" we must also make
sure not to call underlying filesystem's lock operations. Introcude a
super block flag MS_NOREMOTELOCK to this effect.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Acked-by: Jeff Layton <jlayton@poochiereds.net>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Specify the format (size and endianess) for the vlan attributes.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add the definitions for src/dst udp/tcp port masks and use
them when setting && dumping the relevant keys.
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Paul Blakey <paulb@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This action is intended to be an upgrade from a usability perspective
from pedit (as well as operational debugability).
Compare this:
sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \
u32 match ip protocol 1 0xff flowid 1:2 \
action pedit munge offset -14 u8 set 0x02 \
munge offset -13 u8 set 0x15 \
munge offset -12 u8 set 0x15 \
munge offset -11 u8 set 0x15 \
munge offset -10 u16 set 0x1515 \
pipe
to:
sudo tc filter add dev $ETH parent 1: protocol ip prio 10 \
u32 match ip protocol 1 0xff flowid 1:2 \
action skbmod dmac 02:15:15:15:15:15
Also try to do a MAC address swap with pedit or worse
try to debug a policy with destination mac, source mac and
etherype. Then make few rules out of those and you'll get my point.
In the future common use cases on pedit can be migrated to this action
(as an example different fields in ip v4/6, transports like tcp/udp/sctp
etc). For this first cut, this allows modifying basic ethernet header.
The most important ethernet use case at the moment is when redirecting or
mirroring packets to a remote machine. The dst mac address needs a re-write
so that it doesnt get dropped or confuse an interconnecting (learning) switch
or dropped by a target machine (which looks at the dst mac). And at times
when flipping back the packet a swap of the MAC addresses is needed.
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This time around we have 92 non-merge commits. Most
of the changes are in drivers/usb/gadget (40.3%)
with drivers/usb/gadget/function being the most
active directory (27.2%).
As for UDC drivers, only dwc3 (26.5%) and dwc2
(12.7%) have really been active.
The most important changes for dwc3 are better
support for scatterlist and, again, throughput
improvements. While on dwc2 got some minor stability
fixes related to soft reset and FIFO usage.
Felipe Tonello has done some good work fixing up our
f_midi gadget and Tal Shorer has implemented a nice
API change for our ULPI bus.
Apart from these, we have our usual set of
non-critical fixes, spelling fixes, build warning
fixes, etc.
-----BEGIN PGP SIGNATURE-----
iQI6BAABCAAkBQJX2TpXHRxmZWxpcGUuYmFsYmlAbGludXguaW50ZWwuY29tAAoJ
EMy+uJnhGpkGxX0QAIOavB96wkAP4msMzCMIKyKX8NBVWEYzLy7Ou6IrPKiGOR28
CjDi1C5qW7838H4neA6Gfw896rfTiAODhoiOY/RTXI7p2hTUUXHQuJ81Bad75gHD
744BUMPy37YJnvgHTasYn0GxAvP73YmV+omRxo76poetYZ9eH8dGECvC9q6m+jRU
XaubWEq1JMvzHvlyO7BIrndGY4ByRbBoG0XPiZF07e5YDkKWQmv56tgAAN7fEkeh
8HIg8lG2xvgf+w6cDbrQ2c8fp055OvrOq40R2pSXwQgYYKXPJ+vFiNzriQ6Rfxai
gIYrB+mrKZcY6mi6OhoulGfNxT65VqMqnUfwVbbwlJQbDe5EkV6o/1WYdaBvdO2s
qTT9A5alabFzbQ8ZtjzsIHtV62LwmZlMWk7gxZlcvLFNjf/P2CMqqnJi30/JlrsE
iqhwIGRDhMq4QZZbiiEiJEaEn6vh2zseRdmCy3uMFearXKBP/I2177QOTDG7ZMKf
fZR4ROlv6c5tIpBCOsTV0+7c/fnnnOTHU4+vJiUzU0krkPzaLcL8iMT1tn+uGchX
4d2XLuT6AbVxQR4N8YF4FwRzB/PbEb+ZWWGu1mOVSd9/dsA43K50zNdc061dgz8K
q8lau6bmtfUXdbeWa3WMEaAZIuSBmFarJY0tPZV6W7cXUAgKitThRD6fp4E0
=vTFa
-----END PGP SIGNATURE-----
Merge tag 'usb-for-v4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb into usb-next
Felipe writes:
usb: patches for v4.9 merge window
This time around we have 92 non-merge commits. Most
of the changes are in drivers/usb/gadget (40.3%)
with drivers/usb/gadget/function being the most
active directory (27.2%).
As for UDC drivers, only dwc3 (26.5%) and dwc2
(12.7%) have really been active.
The most important changes for dwc3 are better
support for scatterlist and, again, throughput
improvements. While on dwc2 got some minor stability
fixes related to soft reset and FIFO usage.
Felipe Tonello has done some good work fixing up our
f_midi gadget and Tal Shorer has implemented a nice
API change for our ULPI bus.
Apart from these, we have our usual set of
non-critical fixes, spelling fixes, build warning
fixes, etc.
These counters sit in hot path and do show up in perf, this is especially
true for 'found' and 'searched' which get incremented for every packet
processed.
Information like
searched=212030105
new=623431
found=333613
delete=623327
does not seem too helpful nowadays:
- on busy systems found and searched will overflow every few hours
(these are 32bit integers), other more busy ones every few days.
- for debugging there are better methods, such as iptables' trace target,
the conntrack log sysctls. Nowadays we also have perf tool.
This removes packet path stat counters except those that
are expected to be 0 (or close to 0) on a normal system, e.g.
'insert_failed' (race happened) or 'invalid' (proto tracker rejects).
The insert stat is retained for the ctnetlink case.
The found stat is retained for the tuple-is-taken check when NAT has to
determine if it needs to pick a different source address.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The dynset expression matches if we can fit a new entry into the set.
If there is no room for it, then it breaks the rule evaluation.
This patch introduces the inversion flag so you can add rules to
explicitly drop packets that don't fit into the set. For example:
# nft filter input flow table xyz size 4 { ip saddr timeout 120s counter } overflow drop
This is useful to provide a replacement for connlimit.
For the rule above, every new entry uses the IPv4 address as key in the
set, this entry gets a timeout of 120 seconds that gets refresh on every
packet seen. If we get new flow and our set already contains 4 entries
already, then this packet is dropped.
You can already express this in positive logic, assuming default policy
to drop:
# nft filter input flow table xyz size 4 { ip saddr timeout 10s counter } accept
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Add support to pass through an offset to the hash value. With this
feature, the sysadmin is able to generate a hash with a given
offset value.
Example:
meta mark set jhash ip saddr mod 2 seed 0xabcd offset 100
This option generates marks according to the source address from 100 to
101.
Signed-off-by: Laura Garcia Liebana <nevola@gmail.com>
This action could be used before redirecting packets to a shared tunnel
device, or when redirecting packets arriving from a such a device.
The action will release the metadata created by the tunnel device
(decap), or set the metadata with the specified values for encap
operation.
For example, the following flower filter will forward all ICMP packets
destined to 11.11.11.2 through the shared vxlan device 'vxlan0'. Before
redirecting, a metadata for the vxlan tunnel is created using the
tunnel_key action and it's arguments:
$ tc filter add dev net0 protocol ip parent ffff: \
flower \
ip_proto 1 \
dst_ip 11.11.11.2 \
action tunnel_key set \
src_ip 11.11.0.1 \
dst_ip 11.11.0.2 \
id 11 \
action mirred egress redirect dev vxlan0
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Reviewed-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce classifying by metadata extracted by the tunnel device.
Outer header fields - source/dest ip and tunnel id, are extracted from
the metadata when classifying.
For example, the following will add a filter on the ingress Qdisc of shared
vxlan device named 'vxlan0'. To forward packets with outer src ip
11.11.0.2, dst ip 11.11.0.1 and tunnel id 11. The packets will be
forwarded to tap device 'vnet0' (after metadata is released):
$ tc filter add dev vxlan0 protocol ip parent ffff: \
flower \
enc_src_ip 11.11.0.2 \
enc_dst_ip 11.11.0.1 \
enc_key_id 11 \
dst_ip 11.11.11.1 \
action tunnel_key release \
action mirred egress redirect dev vnet0
The action tunnel_key, will be introduced in the next patch in this
series.
Signed-off-by: Amir Vadai <amir@vadai.me>
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The codes will be called:
MEDIA_BUS_FMT_SBGGR16_1X16
MEDIA_BUS_FMT_SGBRG16_1X16
MEDIA_BUS_FMT_SGRBG16_1X16
MEDIA_BUS_FMT_SRGGB16_1X16
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Trivially fix those broken references, by copying the structs
fron the header, just like other API documentation at the
DVB side.
This doesn't have the level of quality used at the V4L2 side
of the API, but, as this documents a deprecated API, used
only by av7110 driver, it doesn't make much sense to invest
time making it better.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Reported-by: Paul Wouters <paul@nohats.ca>
Signed-off-by: Richard Guy Briggs <rgb@tricolour.ca>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
openvswitch: Add support for 8021.AD
Change the description of the VLAN tpid field.
Signed-off-by: Thomas F Herbert <thomasfherbert@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This adds the capability for a process that has CAP_NET_ADMIN on
a socket to see the socket mark in socket dumps.
Commit a52e95abf7 ("net: diag: allow socket bytecode filters to
match socket marks") recently gave privileged processes the
ability to filter socket dumps based on mark. This patch is
complementary: it ensures that the mark is also passed to
userspace in the socket's netlink attributes. It is useful for
tools like ss which display information about sockets.
Tested: https://android-review.googlesource.com/270210
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The _until_ attribute is renamed to _modulus_ as the behaviour is similar to
other expresions with number limits (ex. nft_hash).
Renaming is possible because there isn't a kernel release yet with these
changes.
Signed-off-by: Laura Garcia Liebana <nevola@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
There are already some GRE_* macros in kernel, so it is unnecessary
to define these macros. And remove some useless macros
Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for your net-next
tree. Most relevant updates are the removal of per-conntrack timers to
use a workqueue/garbage collection approach instead from Florian
Westphal, the hash and numgen expression for nf_tables from Laura
Garcia, updates on nf_tables hash set to honor the NLM_F_EXCL flag,
removal of ip_conntrack sysctl and many other incremental updates on our
Netfilter codebase.
More specifically, they are:
1) Retrieve only 4 bytes to fetch ports in case of non-linear skb
transport area in dccp, sctp, tcp, udp and udplite protocol
conntrackers, from Gao Feng.
2) Missing whitespace on error message in physdev match, from Hangbin Liu.
3) Skip redundant IPv4 checksum calculation in nf_dup_ipv4, from Liping Zhang.
4) Add nf_ct_expires() helper function and use it, from Florian Westphal.
5) Replace opencoded nf_ct_kill() call in IPVS conntrack support, also
from Florian.
6) Rename nf_tables set implementation to nft_set_{name}.c
7) Introduce the hash expression to allow arbitrary hashing of selector
concatenations, from Laura Garcia Liebana.
8) Remove ip_conntrack sysctl backward compatibility code, this code has
been around for long time already, and we have two interfaces to do
this already: nf_conntrack sysctl and ctnetlink.
9) Use nf_conntrack_get_ht() helper function whenever possible, instead
of opencoding fetch of hashtable pointer and size, patch from Liping Zhang.
10) Add quota expression for nf_tables.
11) Add number generator expression for nf_tables, this supports
incremental and random generators that can be combined with maps,
very useful for load balancing purpose, again from Laura Garcia Liebana.
12) Fix a typo in a debug message in FTP conntrack helper, from Colin Ian King.
13) Introduce a nft_chain_parse_hook() helper function to parse chain hook
configuration, this is used by a follow up patch to perform better chain
update validation.
14) Add rhashtable_lookup_get_insert_key() to rhashtable and use it from the
nft_set_hash implementation to honor the NLM_F_EXCL flag.
15) Missing nulls check in nf_conntrack from nf_conntrack_tuple_taken(),
patch from Florian Westphal.
16) Don't use the DYING bit to know if the conntrack event has been already
delivered, instead a state variable to track event re-delivery
states, also from Florian.
17) Remove the per-conntrack timer, use the workqueue approach that was
discussed during the NFWS, from Florian Westphal.
18) Use the netlink conntrack table dump path to kill stale entries,
again from Florian.
19) Add a garbage collector to get rid of stale conntracks, from
Florian.
20) Reschedule garbage collector if eviction rate is high.
21) Get rid of the __nf_ct_kill_acct() helper.
22) Use ARPHRD_ETHER instead of hardcoded 1 from ARP logger.
23) Make nf_log_set() interface assertive on unsupported families.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce BPF_PROG_TYPE_PERF_EVENT programs that can be attached to
HW and SW perf events (PERF_TYPE_HARDWARE and PERF_TYPE_SOFTWARE
correspondingly in uapi/linux/perf_event.h)
The program visible context meta structure is
struct bpf_perf_event_data {
struct pt_regs regs;
__u64 sample_period;
};
which is accessible directly from the program:
int bpf_prog(struct bpf_perf_event_data *ctx)
{
... ctx->sample_period ...
... ctx->regs.ip ...
}
The bpf verifier rewrites the accesses into kernel internal
struct bpf_perf_event_data_kern which allows changing
struct perf_sample_data without affecting bpf programs.
New fields can be added to the end of struct bpf_perf_event_data
in the future.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a per-port flag to control the unknown multicast flood, similar to the
unknown unicast flood flag and break a few long lines in the netlink flag
exports.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch enhances ethtool link mode bitmap to include
missing interface modes for 1G/10G speeds
Changes:
1000baseX is the mode introduced to cover all 1G Fiber cases.
All modes under 1000BaseX i.e. 1000BASE-SX, 1000BASE-LX, 1000BASE-LX10
and 1000BASE-BX10 are not explicitly defined at this moment.
10G CR,SR,LR and ER link modes are included for 10G speed..
Issue:
ethtool on 1G/10G SFP port reports Base-T
as this port supports 1000baseX,10G CR, SR and LR modes.
root@tor-02$ ethtool swp1
Settings for swp1:
Supported ports: [ FIBRE ]
Supported link modes: 1000baseT/Full
10000baseT/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: Yes
Advertised link modes: 1000baseT/Full
Advertised pause frame use: No
Advertised auto-negotiation: No
Speed: 10000Mb/s
Duplex: Full
Port: FIBRE
PHYAD: 0
Transceiver: external
Auto-negotiation: off
Current message level: 0x00000000 (0)
Link detected: yes
After fix:
root@tor-02$ ethtool swp1
Settings for swp1:
Supported ports: [ FIBRE ]
Supported link modes: 1000baseX/Full
10000baseCR/Full
10000baseSR/Full
10000baseLR/Full
10000baseER/Full
Supported pause frame use: Symmetric Receive-only
Supports auto-negotiation: Yes
Advertised link modes: 1000baseT/Full
Advertised pause frame use: No
Advertised auto-negotiation: No
Speed: 10000Mb/s
Duplex: Full
Port: FIBRE
PHYAD: 0
Transceiver: external
Auto-negotiation: off
Current message level: 0x00000000 (0)
Link detected: yes
Signed-off-by: Vidya Sagar Ravipati <vidya@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add UDP bearer options to netlink bearer get message. This is used by
the tipc user space tool to display UDP options.
The UDP bearer information is passed using either a sockaddr_in or
sockaddr_in6 structs. This means the user space receiver should
intermediately store the retrieved data in a large enough struct
(sockaddr_strage) before casting to the proper IP version type.
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch introduces UDP replicast. A concept where we emulate
multicast by sending multiple unicast messages to configured peers.
The purpose of replicast is mainly to be able to use TIPC in cloud
environments where IP multicast is disabled. Using replicas to unicast
multicast messages is costly as we have to copy each skb and send the
copies individually.
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds SNMP counter for drops caused by MD5 mismatches.
The current syslog might help, but a counter is more precise and helps
monitoring.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The PTM Control register (PCIe r3.1, sec 7.32.3) contains an Effective
Granularity field:
This provides information relating to the expected accuracy of the PTM
clock, but does not otherwise affect the PTM mechanism.
Set the Effective Granularity based on the PTM Root and any intervening PTM
Time Sources.
This does not set Effective Granularity for Root Complex Integrated
Endpoints because I don't know how to figure out clock granularity for
them. The spec says:
... system software must set [Effective Granularity] to the value
reported in the Local Clock Granularity field by the associated PTM
Time Source.
but I don't know how to identify the associated PTM Time Source. Normally
it's the upstream bridge, but an integrated endpoint has no upstream
bridge.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Introduces a new FunctionFS descriptor flag named
FUNCTIONFS_CONFIG0_SETUP.
When this flag is enabled, FunctionFS userspace drivers can process
non-standard control requests in configuration 0.
Signed-off-by: Felix Hädicke <felixhaedicke@web.de>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
Introduces a new FunctionFS descriptor flag named
FUNCTIONFS_ALL_CTRL_RECIP. When this flag is enabled, control requests,
which are not explicitly directed to an interface or endpoint, can be
handled.
This allows FunctionFS userspace drivers to process non-standard
control requests.
Signed-off-by: Felix Hädicke <felixhaedicke@web.de>
Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
This allows a privileged process to filter by socket mark when
dumping sockets via INET_DIAG_BY_FAMILY. This is useful on
systems that use mark-based routing such as Android.
The ability to filter socket marks requires CAP_NET_ADMIN, which
is consistent with other privileged operations allowed by the
SOCK_DIAG interface such as the ability to destroy sockets and
the ability to inspect BPF filters attached to packet sockets.
Tested: https://android-review.googlesource.com/261350
Signed-off-by: Lorenzo Colitti <lorenzo@google.com>
Acked-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This define is a duplicate of V4L2_YCBCR_ENC_601. So mark it as
'should not be used' and put it under #ifndef __KERNEL__ to
prevent drivers from referring to it.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
The default quantization range of the Y'CbCr encodings of sRGB
and AdobeRGB are full range instead of limited range according to
the corresponding standards.
Fix this in the V4L2_MAP_QUANTIZATION_DEFAULT macro.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
In support of enabling resize / truncate of device-dax instances, define
a pseudo-fs to provide a unified inode/address space for vm operations.
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Some touch controllers send out touch data in a similar way to a
greyscale frame grabber.
Add new device type VFL_TYPE_TOUCH:
- This uses a new device prefix v4l-touch for these devices, to stop
generic capture software from treating them as webcams. Otherwise,
touch is treated similarly to video capture.
- Add V4L2_INPUT_TYPE_TOUCH
- Add MEDIA_INTF_T_V4L_TOUCH
- Add V4L2_CAP_TOUCH to indicate device is a touch device
Add formats:
- V4L2_TCH_FMT_DELTA_TD16 for signed 16-bit touch deltas
- V4L2_TCH_FMT_DELTA_TD08 for signed 16-bit touch deltas
- V4L2_TCH_FMT_TU16 for unsigned 16-bit touch data
- V4L2_TCH_FMT_TU08 for unsigned 8-bit touch data
This support will be used by:
- Atmel maXTouch (atmel_mxt_ts)
- Synaptics RMI4.
- sur40
Signed-off-by: Nick Dyer <nick@shmanahar.org>
Tested-by: Chris Healy <cphealy@gmail.com>
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Fixes these compiler warnings via libc-compat.h when glibc netipx/ipx.h is
included before linux/ipx.h:
./linux/ipx.h:9:8: error: redefinition of ‘struct sockaddr_ipx’
./linux/ipx.h:26:8: error: redefinition of ‘struct ipx_route_definition’
./linux/ipx.h:32:8: error: redefinition of ‘struct ipx_interface_definition’
./linux/ipx.h:49:8: error: redefinition of ‘struct ipx_config_data’
./linux/ipx.h:58:8: error: redefinition of ‘struct ipx_route_def’
Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kernel uapi header are supposed to use them. Fixes userspace compile error:
linux/openvswitch.h:583:2: error: unknown type name ‘uint32_t’
Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes userspace compile error:
error: field ‘real’ has incomplete type
struct timeval real; /* real (wall-clock) time */
Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes userspace compiler error:
error: unknown type name ‘uint32_t’
Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes userspace compilation errors:
error: field ‘addr’ has incomplete type
struct sockaddr_in addr; /* IP address and port to send to */
error: field ‘addr’ has incomplete type
struct sockaddr_in6 addr; /* IP address and port to send to */
Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes userspace compilation errors like:
error: field ‘addr’ has incomplete type
struct sockaddr_in addr; /* IP address and port to send to */
^
error: field ‘addr’ has incomplete type
struct sockaddr_in6 addr; /* IP address and port to send to */
Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes userspace compilation errors like:
error: field ‘iph’ has incomplete type
error: field ‘prefix’ has incomplete type
Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes userspace compilation error:
error: ‘IFNAMSIZ’ undeclared here (not in a function)
Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fixes userspace compiler errors like:
linux/hsi/hsi_char.h:51:2: error: unknown type name ‘uint32_t’
Signed-off-by: Mikko Rapeli <mikko.rapeli@iki.fi>
Signed-off-by: Sebastian Reichel <sre@kernel.org>
This patch adds the numgen expression that allows us to generated
incremental and random numbers, this generator is bound to a upper limit
that is specified by userspace.
This expression is useful to distribute packets in a round-robin fashion
as well as randomly.
Signed-off-by: Laura Garcia Liebana <nevola@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This patch adds the quota expression. This new stateful expression
integrate easily into the dynset expression to build 'hashquota' flow
tables.
Arguably, we could use instead "counter bytes > 1000" instead, but this
approach has several problems:
1) We only support for one single stateful expression in dynamic set
definitions, and the expression above is a composite of two
expressions: get counter + comparison.
2) We would need to restore the packed counter representation (that we
used to have) based on seqlock to synchronize this, since per-cpu is
not suitable for this.
So instead of bloating the counter expression back with the seqlock
representation and extending the existing set infrastructure to make it
more complex for the composite described above, let's follow the more
simple approach of adding a quota expression that we can plug into our
existing infrastructure.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This work adds a bpf_skb_change_tail() helper for tc BPF programs. The
basic idea is to expand or shrink the skb in a controlled manner. The
eBPF program can then rewrite the rest via helpers like bpf_skb_store_bytes(),
bpf_lX_csum_replace() and others rather than passing a raw buffer for
writing here.
bpf_skb_change_tail() is really a slow path helper and intended for
replies with f.e. ICMP control messages. Concept is similar to other
helpers like bpf_skb_change_proto() helper to keep the helper without
protocol specifics and let the BPF program mangle the remaining parts.
A flags field has been added and is reserved for now should we extend
the helper in future.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add TIPC_NL_PEER_REMOVE netlink command. This command can remove
an offline peer node from the internal data structures.
This will be supported by the tipc user space tool in iproute2.
Signed-off-by: Richard Alpe <richard.alpe@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use one of the vlan xstats padding fields to export the vlan flags. This is
needed in order to be able to distinguish between master (bridge) and port
vlan entries in user-space when dumping the bridge vlan stats.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The current vlan push action supports only vid and protocol options.
Add priority option.
Example script that adds vlan push action with vid and
priority:
tc filter add dev veth0 protocol ip parent ffff: \
flower \
indev veth0 \
action vlan push id 100 priority 5
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enhance flower to support 802.1Q vlan protocol classification.
Currently, the supported fields are vlan_id and vlan_priority.
Example:
# add a flower filter with vlan id and priority classification
tc filter add dev ens4f0 protocol 802.1Q parent ffff: \
flower \
indev ens4f0 \
vlan_ethtype ipv4 \
vlan_id 100 \
vlan_prio 3 \
action vlan pop
Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add an pci_enable_ptm() interface so drivers can enable PTM.
The PCI core enables PTM on PTM Roots and switches automatically, but we
don't enable PTM on endpoints unless a driver requests it.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Minor overlapping changes for both merge conflicts.
Resolution work done by Stephen Rothwell was used
as a reference.
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
1) Buffers powersave frame test is reversed in cfg80211, fix from Felix
Fietkau.
2) Remove bogus WARN_ON in openvswitch, from Jarno Rajahalme.
3) Fix some tg3 ethtool logic bugs, and one that would cause no
interrupts to be generated when rx-coalescing is set to 0. From
Satish Baddipadige and Siva Reddy Kallam.
4) QLCNIC mailbox corruption and napi budget handling fix from Manish
Chopra.
5) Fix fib_trie logic when walking the trie during /proc/net/route
output than can access a stale node pointer. From David Forster.
6) Several sctp_diag fixes from Phil Sutter.
7) PAUSE frame handling fixes in mlxsw driver from Ido Schimmel.
8) Checksum fixup fixes in bpf from Daniel Borkmann.
9) Memork leaks in nfnetlink, from Liping Zhang.
10) Use after free in rxrpc, from David Howells.
11) Use after free in new skb_array code of macvtap driver, from Jason
Wang.
12) Calipso resource leak, from Colin Ian King.
13) mediatek bug fixes (missing stats sync init, etc.) from Sean Wang.
14) Fix bpf non-linear packet write helpers, from Daniel Borkmann.
15) Fix lockdep splats in macsec, from Sabrina Dubroca.
16) hv_netvsc bug fixes from Vitaly Kuznetsov, mostly to do with VF
handling.
17) Various tc-action bug fixes, from CONG Wang.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (116 commits)
net_sched: allow flushing tc police actions
net_sched: unify the init logic for act_police
net_sched: convert tcf_exts from list to pointer array
net_sched: move tc offload macros to pkt_cls.h
net_sched: fix a typo in tc_for_each_action()
net_sched: remove an unnecessary list_del()
net_sched: remove the leftover cleanup_a()
mlxsw: spectrum: Allow packets to be trapped from any PG
mlxsw: spectrum: Unmap 802.1Q FID before destroying it
mlxsw: spectrum: Add missing rollbacks in error path
mlxsw: reg: Fix missing op field fill-up
mlxsw: spectrum: Trap loop-backed packets
mlxsw: spectrum: Add missing packet traps
mlxsw: spectrum: Mark port as active before registering it
mlxsw: spectrum: Create PVID vPort before registering netdevice
mlxsw: spectrum: Remove redundant errors from the code
mlxsw: spectrum: Don't return upon error in removal path
i40e: check for and deal with non-contiguous TCs
ixgbe: Re-enable ability to toggle VLAN filtering
ixgbe: Force VLNCTRL.VFE to be set in all VMDq paths
...
supersede our debugfs configuration interface in the long run. It is
especially necessary when batman-adv should be used in different
namespaces, since debugfs can not differentiate between those.
More specifically, the following changes are included:
- Two fixes for namespace handling by Andrew Lunn, checking also the
namespaces for parent interfaces, and supress debugfs entries
for non-default netns
- Implement various netlink commands for the new interface, by
Matthias Schiffer, Andrew Lunn, Sven Eckelmann and Simon Wunderlich
(13 patches):
* routing algorithm list
* hardif list
* translation tables (local and global)
* TTVN for the translation tables
* originator and neighbor tables for B.A.T.M.A.N. IV
and B.A.T.M.A.N. V
* gateway dump functionality for B.A.T.M.A.N. IV
and B.A.T.M.A.N. V
* Bridge Loop Avoidance claims, and corresponding BLA group
* Bridge Loop Avoidance backbone tables
- Finally, mark batman-adv as netns compatible, by Andrew Lunn (1 patch)
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdBQJXss7MFhxzd0BzaW1vbnd1bmRlcmxpY2guZGUACgkQoSvjmEKS
nqGvTBAAw7A0lG5ghEEDTVWl++/q3fc41ZPn+XGihizQ3z9Hy5ZAuyREKqMz43RP
MJb2sHnoS/guCY7Y0Mn/ubQDuvp7PmqJxNmHiqdW0UVKrwgRrlhk/uZfd3Blib8J
TR1ktRAT/OKtPIxps2CSq2UX1GcnadtstaUvDDSWnak/0zsQl5GWVYxOkbdsbYUb
qYAbcHBXkdvTfIpZxSwb3QDfKoRs+Hf8hr09V19DH/GZs4puYbIxjw1QhC2TBe0f
SkcMVkmQ6GqJsjRU4BDVCrrfYvv3ncBWXtb5CKyq8il2AvdI1HbXha9hpg0SO69p
fAC5yzyB0rCCr7AKMYBgeIf9u6z5mllKly9QJkZMjtWuIIxt4J5rFK2PN+M3xprb
BWXrINWR4/1C4LA3dDvCL7sFHlObHVKRjSNwzmQ3b6UNY72d6UILG0D9JTI8M+y7
YXtjwCQYNCvjmkprM6mgPMnlk90RdXNhNUngfOe2/2C1li2gaodX7lrx+lBS8/5N
oK5W85vmO41FChLFof5PV6mn4cUV7sKlKPmv93xRvHd89RWBWIU/kGpQQjkCgh5U
44CJiD+FDRkEkDJVo7IkqTxGF39zYR39mQrNFXc6G1H4wRFtqHGP+VOa72a/7arV
FeGtulzeGBK3z1Qi9UyjS2N9mDYSKkfj4f2H+AC1GCRC2mTMCQU=
=KDfF
-----END PGP SIGNATURE-----
Merge tag 'batadv-next-for-davem-20160816' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
pull request for net-next: batman-adv 2016-08-16
This feature patchset is all about adding netlink support, which should
supersede our debugfs configuration interface in the long run. It is
especially necessary when batman-adv should be used in different
namespaces, since debugfs can not differentiate between those.
More specifically, the following changes are included:
- Two fixes for namespace handling by Andrew Lunn, checking also the
namespaces for parent interfaces, and supress debugfs entries
for non-default netns
- Implement various netlink commands for the new interface, by
Matthias Schiffer, Andrew Lunn, Sven Eckelmann and Simon Wunderlich
(13 patches):
* routing algorithm list
* hardif list
* translation tables (local and global)
* TTVN for the translation tables
* originator and neighbor tables for B.A.T.M.A.N. IV
and B.A.T.M.A.N. V
* gateway dump functionality for B.A.T.M.A.N. IV
and B.A.T.M.A.N. V
* Bridge Loop Avoidance claims, and corresponding BLA group
* Bridge Loop Avoidance backbone tables
- Finally, mark batman-adv as netns compatible, by Andrew Lunn (1 patch)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This driver is responsible for implementing ISH HID client, which
gets HID description and report. Once it has completely gets
report descriptors, it registers as a HID LL drivers. This implements
necessary callbacks so that it can be used by HID sensor hub driver.
Original-author: Daniel Drubin <daniel.drubin@intel.com>
Reviewed-and-tested-by: Ooi, Joyce <joyce.ooi@intel.com>
Tested-by: Grant Likely <grant.likely@secretlab.ca>
Tested-by: Rann Bar-On <rb6@duke.edu>
Tested-by: Atri Bhattacharya <badshah400@aim.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Add Precision Time Measurement (PTM) support (see PCIe r3.1, sec 6.22).
Enable PTM on PTM Root devices and switch ports. This does not enable PTM
on endpoints.
There currently are no PTM-capable devices on the market, but it is
expected to be supported by the Intel Apollo Lake platform.
[bhelgaas: complete rework]
Signed-off-by: Jonathan Yong <jonathan.yong@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
"NVDIMM DSM Interface Example" v1.2 made an incompatible change to the
layout of function1 "SMART and Health Info". While the kernel does not
directly consume this payload, it does define it in ndctl.h that
userpace utilities consume.
Reported-by: Brian Boylston <brian.boylston@hpe.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
1. Use struct gre_base_hdr directly in pptp_gre_header instead of
duplicated members;
2. Use existing macros like GRE_KEY, GRE_SEQ, and so on instead of
duplicated macros defined by PPTP;
3. Add new macros like GRE_IS_ACK/SEQ and so on instead of
PPTP_GRE_IS_A/S and so on;
Signed-off-by: Gao Feng <fgao@ikuai8.com>
Reviewed-by: Philip Prindeville <philipp@redfish-solutions.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While hashing out BPF's current_task_under_cgroup helper bits, it came
to discussion that the skb_in_cgroup helper name was suboptimally chosen.
Tejun says:
So, I think in_cgroup should mean that the object is in that
particular cgroup while under_cgroup in the subhierarchy of that
cgroup. Let's rename the other subhierarchy test to under too. I
think that'd be a lot less confusing going forward.
[...]
It's more intuitive and gives us the room to implement the real
"in" test if ever necessary in the future.
Since this touches uapi bits, we need to change this as long as v4.8
is not yet officially released. Thus, change the helper enum and rename
related bits.
Fixes: 4a482f34af ("cgroup: bpf: Add bpf_skb_in_cgroup_proto")
Reference: http://patchwork.ozlabs.org/patch/658500/
Suggested-by: Sargun Dhillon <sargun@sargun.me>
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
This adds a bpf helper that's similar to the skb_in_cgroup helper to check
whether the probe is currently executing in the context of a specific
subset of the cgroupsv2 hierarchy. It does this based on membership test
for a cgroup arraymap. It is invalid to call this in an interrupt, and
it'll return an error. The helper is primarily to be used in debugging
activities for containers, where you may have multiple programs running in
a given top-level "container".
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds mask for the Control register
10Mbps speed.
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Kedareswara rao Appana <appanad@xilinx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a new hash expression, this provides jhash support but
this can be extended to support for other hash functions. The modulus
and seed already comes embedded into this new expression.
Use case example:
... meta mark set hash ip saddr mod 10
Signed-off-by: Laura Garcia Liebana <nevola@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The PPTP is encapsulated by GRE header with that GRE_VERSION bits
must contain one. But current GRE RPS needs the GRE_VERSION must be
zero. So RPS does not work for PPTP traffic.
In my test environment, there are four MIPS cores, and all traffic
are passed through by PPTP. As a result, only one core is 100% busy
while other three cores are very idle. After this patch, the usage
of four cores are balanced well.
Signed-off-by: Gao Feng <fgao@ikuai8.com>
Reviewed-by: Philip Prindeville <philipp@redfish-solutions.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for your net tree,
they are:
1) Use mod_timer_pending() to avoid reactivating a dead expectation in
the h323 conntrack helper, from Liping Zhang.
2) Oneliner to fix a type in the register name defined in the nf_tables
header.
3) Don't try to look further when we find an inactive elements with no
descendants in the rbtree set implementation, otherwise we crash.
4) Handle valid zero CSeq in the SIP conntrack helper, from
Christophe Leroy.
5) Don't display a trailing slash in conntrack helper with no classes
via /proc/net/nf_conntrack_expect, from Liping Zhang.
6) Fix an expectation leak during creation from the nfqueue path, again
from Liping Zhang.
7) Validate netlink port ID in verdict message from nfqueue, otherwise
an injection can be possible. Again from Zhang.
8) Reject conntrack tuples with different transport protocol on
original and reply tuples, also from Zhang.
9) Validate offset and length in nft_exthdr, make sure they are under
sizeof(u8), from Laura Garcia Liebana.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Dump the list of bridge loop avoidance backbones via the netlink socket.
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Dump the list of bridge loop avoidance claims via the netlink socket.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
[sven.eckelmann@open-mesh.com: add policy for attributes, fix includes, fix
soft_iface reference leak]
Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com>
[sw@simonwunderlich.de: fix kerneldoc, fix error reporting]
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Add BATADV_CMD_GET_GATEWAYS commands, using handlers bat_gw_dump in
batadv_algo_ops. Will always return -EOPNOTSUPP for now, as no
implementations exist yet.
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Dump the algo V originators and neighbours.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
[sven@narfation.org: Fix includes, fix algo_ops integration]
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Add BATADV_CMD_GET_ORIGINATORS and BATADV_CMD_GET_NEIGHBORS commands,
using handlers bat_orig_dump and bat_neigh_dump in batadv_algo_ops. Will
always return -EOPNOTSUPP for now, as no implementations exist yet.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
[sven@narfation.org: Rewrite based on new algo_ops structures]
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
This adds the commands BATADV_CMD_GET_TRANSTABLE_LOCAL and
BATADV_CMD_GET_TRANSTABLE_GLOBAL, which correspond to the transtable_local
and transtable_global debugfs files.
The batadv_tt_client_flags enum is moved to the UAPI to expose it as part
of the netlink API.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
[sven.eckelmann@open-mesh.com: add policy for attributes, fix includes]
Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com>
[sw@simonwunderlich.de: fix VID attributes content]
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
BATADV_CMD_GET_HARDIFS will return the list of hardifs (including index,
name and MAC address) of all hardifs for a given softif.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
[sven.eckelmann@open-mesh.com: Reduce the number of changes to
BATADV_CMD_GET_HARDIFS, add policy for attributes]
Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
BATADV_CMD_GET_ROUTING_ALGOS is used to get the list of supported routing
algorithms.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
[sven.eckelmann@open-mesh.com: Reduce the number of changes to
BATADV_CMD_GET_ROUTING_ALGOS, fix includes]
Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
This is required to correctly interpret INET_DIAG_INFO messages exported
by sctp_diag module.
Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
- New vsock device support in host and guest
- Platform IOMMU support in host and guest,
including compatibility quirks for legacy systems.
- Misc fixes and cleanups.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJXofvbAAoJECgfDbjSjVRpUTIH/iEoK9h636tBayXy0PXkPby0
6fMaRFy6H1HgEttgDhJE8Pqg/ba3qaW9Em0fHyFq7Mp2waFHAZ8hAT8phC6TAK3c
CIBnfzyyuI8u3N9SnNOfelPVcwCBfuALuuTsXB/rwKbYQEVv+U5Rdt3Vyx9+lXkj
P005klz7PfqxFhQrrnj4Eh7VawtHwmMuLH8YoWpCZpM71dHPo6eL+3ftKwhH2boo
qK86uVprwba03Pewpm13vQnotemfVfUUkjXd4EJpG3dx7E0KZosuj0ZG9OV8mPGQ
Cl2gBdUhocdJgeUnAHmf6tumYi9KFlYfy6xLy44YMmN7FL3E9nQjaKZp25UKfiM=
=ztIm
-----END PGP SIGNATURE-----
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio/vhost updates from Michael Tsirkin:
- new vsock device support in host and guest
- platform IOMMU support in host and guest, including compatibility
quirks for legacy systems.
- misc fixes and cleanups.
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
VSOCK: Use kvfree()
vhost: split out vringh Kconfig
vhost: detect 32 bit integer wrap around
vhost: new device IOTLB API
vhost: drop vringh dependency
vhost: convert pre sorted vhost memory array to interval tree
vhost: introduce vhost memory accessors
VSOCK: Add Makefile and Kconfig
VSOCK: Introduce vhost_vsock.ko
VSOCK: Introduce virtio_transport.ko
VSOCK: Introduce virtio_vsock_common.ko
VSOCK: defer sock removal to transports
VSOCK: transport-specific vsock_transport functions
vhost: drop vringh dependency
vop: pull in vhost Kconfig
virtio: new feature to detect IOMMU device quirk
balloon: check the number of available pages in leak balloon
vhost: lockless enqueuing
vhost: simplify work flushing
* x86 nested virt tweak and OOPS fix
* Simplify pvclock code (vdso bits acked by Andy Lutomirski).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJXpKOnAAoJEL/70l94x66D5z4H/R2660Vy3brQrI8lGxCtkXJt
AVe8PwI8nDfYJ/UkMZ2KcHPSvy+sHW2ydaZXYNqXHVBeTaUxiPW9rTgK61ebypGL
1tPOgJ3kGZF6XEdAz6gS8LniNFc+D3W6Y6sRylkEsqPj39/hxe7QMoOMSCQ9imbW
WMIx7/81i1EMw6oi+9FVtq+yHCpvyfFnD8t1TDsYWOReVn1J15SxbEs4Ih+hBMLz
HZ5DEjp9cAmzeR7GLje5eH1t6TEEoNb1MNgFWuscoAsDf8D9DKqRB9s0hC+TLFYn
oZbGSqjQwu3/VMblgedinH6X9MTm8V0zW29ToGnDcoO00AUmdlNmXSaZUhvT/Rs=
=H5cD
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull more KVM updates from Paolo Bonzini:
- ARM bugfix and MSI injection support
- x86 nested virt tweak and OOPS fix
- Simplify pvclock code (vdso bits acked by Andy Lutomirski).
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
nvmx: mark ept single context invalidation as supported
nvmx: remove comment about missing nested vpid support
KVM: lapic: fix access preemption timer stuff even if kernel_irqchip=off
KVM: documentation: fix KVM_CAP_X2APIC_API information
x86: vdso: use __pvclock_read_cycles
pvclock: introduce seqcount-like API
arm64: KVM: Set cpsr before spsr on fault injection
KVM: arm: vgic-irqfd: Workaround changing kvm_set_routing_entry prototype
KVM: arm/arm64: Enable MSI routing
KVM: arm/arm64: Enable irqchip routing
KVM: Move kvm_setup_default/empty_irq_routing declaration in arch specific header
KVM: irqchip: Convey devid to kvm_set_msi
KVM: Add devid in kvm_kernel_irq_routing_entry
KVM: api: Pass the devid in the msi routing entry
Fixes:
- Fix early access to cpu_spec relocation from Benjamin Herrenschmidt
- Fix incorrect event codes in power9-event-list from Madhavan Srinivasan
- Move register_process_table() out of ppc_md from Michael Ellerman
Use jump_label for [cpu|mmu]_has_feature() from Aneesh Kumar K.V, Kevin Hao and Michael Ellerman:
- Add mmu_early_init_devtree() from Michael Ellerman
- Move disable_radix handling into mmu_early_init_devtree() from Michael Ellerman
- Do hash device tree scanning earlier from Michael Ellerman
- Do radix device tree scanning earlier from Michael Ellerman
- Do feature patching before MMU init from Michael Ellerman
- Check features don't change after patching from Michael Ellerman
- Make MMU_FTR_RADIX a MMU family feature from Aneesh Kumar K.V
- Convert mmu_has_feature() to returning bool from Michael Ellerman
- Convert cpu_has_feature() to returning bool from Michael Ellerman
- Define radix_enabled() in one place & use static inline from Michael Ellerman
- Add early_[cpu|mmu]_has_feature() from Michael Ellerman
- Convert early cpu/mmu feature check to use the new helpers from Aneesh Kumar K.V
- jump_label: Make it possible for arches to invoke jump_label_init() earlier from Kevin Hao
- Call jump_label_init() in apply_feature_fixups() from Aneesh Kumar K.V
- Remove mfvtb() from Kevin Hao
- Move cpu_has_feature() to a separate file from Kevin Hao
- Add kconfig option to use jump labels for cpu/mmu_has_feature() from Michael Ellerman
- Add option to use jump label for cpu_has_feature() from Kevin Hao
- Add option to use jump label for mmu_has_feature() from Kevin Hao
- Catch usage of cpu/mmu_has_feature() before jump label init from Aneesh Kumar K.V
- Annotate jump label assembly from Michael Ellerman
TLB flush enhancements from Aneesh Kumar K.V:
- radix: Implement tlb mmu gather flush efficiently
- Add helper for finding SLBE LLP encoding
- Use hugetlb flush functions
- Drop multiple definition of mm_is_core_local
- radix: Add tlb flush of THP ptes
- radix: Rename function and drop unused arg
- radix/hugetlb: Add helper for finding page size
- hugetlb: Add flush_hugetlb_tlb_range
- remove flush_tlb_page_nohash
Add new ptrace regsets from Anshuman Khandual and Simon Guo:
- elf: Add powerpc specific core note sections
- Add the function flush_tmregs_to_thread
- Enable in transaction NT_PRFPREG ptrace requests
- Enable in transaction NT_PPC_VMX ptrace requests
- Enable in transaction NT_PPC_VSX ptrace requests
- Adapt gpr32_get, gpr32_set functions for transaction
- Enable support for NT_PPC_CGPR
- Enable support for NT_PPC_CFPR
- Enable support for NT_PPC_CVMX
- Enable support for NT_PPC_CVSX
- Enable support for TM SPR state
- Enable NT_PPC_TM_CTAR, NT_PPC_TM_CPPR, NT_PPC_TM_CDSCR
- Enable support for NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR
- Enable support for EBB registers
- Enable support for Performance Monitor registers
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXpGaLAAoJEFHr6jzI4aWA9aYP/1AqmRPJ9D0XVUJWT+FVABUK
LESESoVFF4Hug1j1F8Synhg5o4SzD2t45iGKbclYaFthOIyovMg7Wr1KSu4hQ0go
rPuQfpXDNQ8jKdDX8hbPXKUxrNRBNfqJGFo5E7mO6wN9AJ9d1LVwQ+jKAva29Tqs
LaAlMbQNbeObPNzOl73B73iew3aozr+mXjBqv82lqvgYknBD2CLf24xGG3eNIbq5
ZZk4LPC8pdkaxnajnzRFzqwiyPWzao0yfpVRKh52TKHBQF/prR/KACb6zUuja/61
krOfegUKob14OYrehjs6X8XNRLnILRI0u1H5bmj7eVEiY/usyNzE93SMHZM3Wdau
sQF/Au4OLNXj0ZQdNBtzRsZRyp1d560Gsj+lQGBoPd4hfIWkFYHvxzxsUSdqv4uA
MWDMwN0Vvfk0cpprsabsWNevkaotYYBU00px5hF/e5ZUc9/x/xYUVMgPEDr0QZLr
cHJo9/Pjk4u/0g4lj+2y1LLl/0tNEZZg69O6bvffPAPVSS4/P4y/bKKYd4I0zL99
Ykp91mSmkl70F3edgOSFqyda2gN2l2Ekb/i081YGXheFy1rbD29Vxv82BOVog4KY
ibvOqp38WDzCVk5OXuCRvBl0VudLKGJYdppU1nXg4KgrTZXHeCAC0E+NzUsgOF4k
OMvQ+5drVxrno+Hw8FVJ
=0Q8E
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull more powerpc updates from Michael Ellerman:
"These were delayed for various reasons, so I let them sit in next a
bit longer, rather than including them in my first pull request.
Fixes:
- Fix early access to cpu_spec relocation from Benjamin Herrenschmidt
- Fix incorrect event codes in power9-event-list from Madhavan Srinivasan
- Move register_process_table() out of ppc_md from Michael Ellerman
Use jump_label use for [cpu|mmu]_has_feature():
- Add mmu_early_init_devtree() from Michael Ellerman
- Move disable_radix handling into mmu_early_init_devtree() from Michael Ellerman
- Do hash device tree scanning earlier from Michael Ellerman
- Do radix device tree scanning earlier from Michael Ellerman
- Do feature patching before MMU init from Michael Ellerman
- Check features don't change after patching from Michael Ellerman
- Make MMU_FTR_RADIX a MMU family feature from Aneesh Kumar K.V
- Convert mmu_has_feature() to returning bool from Michael Ellerman
- Convert cpu_has_feature() to returning bool from Michael Ellerman
- Define radix_enabled() in one place & use static inline from Michael Ellerman
- Add early_[cpu|mmu]_has_feature() from Michael Ellerman
- Convert early cpu/mmu feature check to use the new helpers from Aneesh Kumar K.V
- jump_label: Make it possible for arches to invoke jump_label_init() earlier from Kevin Hao
- Call jump_label_init() in apply_feature_fixups() from Aneesh Kumar K.V
- Remove mfvtb() from Kevin Hao
- Move cpu_has_feature() to a separate file from Kevin Hao
- Add kconfig option to use jump labels for cpu/mmu_has_feature() from Michael Ellerman
- Add option to use jump label for cpu_has_feature() from Kevin Hao
- Add option to use jump label for mmu_has_feature() from Kevin Hao
- Catch usage of cpu/mmu_has_feature() before jump label init from Aneesh Kumar K.V
- Annotate jump label assembly from Michael Ellerman
TLB flush enhancements from Aneesh Kumar K.V:
- radix: Implement tlb mmu gather flush efficiently
- Add helper for finding SLBE LLP encoding
- Use hugetlb flush functions
- Drop multiple definition of mm_is_core_local
- radix: Add tlb flush of THP ptes
- radix: Rename function and drop unused arg
- radix/hugetlb: Add helper for finding page size
- hugetlb: Add flush_hugetlb_tlb_range
- remove flush_tlb_page_nohash
Add new ptrace regsets from Anshuman Khandual and Simon Guo:
- elf: Add powerpc specific core note sections
- Add the function flush_tmregs_to_thread
- Enable in transaction NT_PRFPREG ptrace requests
- Enable in transaction NT_PPC_VMX ptrace requests
- Enable in transaction NT_PPC_VSX ptrace requests
- Adapt gpr32_get, gpr32_set functions for transaction
- Enable support for NT_PPC_CGPR
- Enable support for NT_PPC_CFPR
- Enable support for NT_PPC_CVMX
- Enable support for NT_PPC_CVSX
- Enable support for TM SPR state
- Enable NT_PPC_TM_CTAR, NT_PPC_TM_CPPR, NT_PPC_TM_CDSCR
- Enable support for NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR
- Enable support for EBB registers
- Enable support for Performance Monitor registers"
* tag 'powerpc-4.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (48 commits)
powerpc/mm: Move register_process_table() out of ppc_md
powerpc/perf: Fix incorrect event codes in power9-event-list
powerpc/32: Fix early access to cpu_spec relocation
powerpc/ptrace: Enable support for Performance Monitor registers
powerpc/ptrace: Enable support for EBB registers
powerpc/ptrace: Enable support for NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR
powerpc/ptrace: Enable NT_PPC_TM_CTAR, NT_PPC_TM_CPPR, NT_PPC_TM_CDSCR
powerpc/ptrace: Enable support for TM SPR state
powerpc/ptrace: Enable support for NT_PPC_CVSX
powerpc/ptrace: Enable support for NT_PPC_CVMX
powerpc/ptrace: Enable support for NT_PPC_CFPR
powerpc/ptrace: Enable support for NT_PPC_CGPR
powerpc/ptrace: Adapt gpr32_get, gpr32_set functions for transaction
powerpc/ptrace: Enable in transaction NT_PPC_VSX ptrace requests
powerpc/ptrace: Enable in transaction NT_PPC_VMX ptrace requests
powerpc/ptrace: Enable in transaction NT_PRFPREG ptrace requests
powerpc/process: Add the function flush_tmregs_to_thread
elf: Add powerpc specific core note sections
powerpc/mm: remove flush_tlb_page_nohash
powerpc/mm/hugetlb: Add flush_hugetlb_tlb_range
...
Pull more btrfs updates from Chris Mason:
"This is part two of my btrfs pull, which is some cleanups and a batch
of fixes.
Most of the code here is from Jeff Mahoney, making the pointers we
pass around internally more consistent and less confusing overall. I
noticed a small problem right before I sent this out yesterday, so I
fixed it up and re-tested overnight"
* 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: (40 commits)
Btrfs: fix __MAX_CSUM_ITEMS
btrfs: btrfs_abort_transaction, drop root parameter
btrfs: add btrfs_trans_handle->fs_info pointer
btrfs: btrfs_relocate_chunk pass extent_root to btrfs_end_transaction
btrfs: convert nodesize macros to static inlines
btrfs: introduce BTRFS_MAX_ITEM_SIZE
btrfs: cleanup, remove prototype for btrfs_find_root_ref
btrfs: copy_to_sk drop unused root parameter
btrfs: simpilify btrfs_subvol_inherit_props
btrfs: tests, use BTRFS_FS_STATE_DUMMY_FS_INFO instead of dummy root
btrfs: tests, require fs_info for root
btrfs: tests, move initialization into tests/
btrfs: btrfs_test_opt and friends should take a btrfs_fs_info
btrfs: prefix fsid to all trace events
btrfs: plumb fs_info into btrfs_work
btrfs: remove obsolete part of comment in statfs
btrfs: hide test-only member under ifdef
btrfs: Ratelimit "no csum found" info message
btrfs: Add ratelimit to btrfs printing
Btrfs: fix unexpected balance crash due to BUG_ON
...
Only interesting thing here is Jessica's patch to add ro_after_init support
to modules. The rest are all trivia.
Cheers,
Rusty.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXopD9AAoJENkgDmzRrbjxDVEP+waK+E3Y+vJHibLwwCYcVqLG
OAkQoFXGqxYAo0faGtGPZczxDH/GVK754y+qugOeQvCgHJqit7qWmIUs5uRgqUMb
uKjoUOfCBiVGUsaHfw7RisOP5FXvAk1jkFxBVtywPj6eIonLr9BB4VE813iXnYGG
RkVFvAmFxMgq2BY+yjp4IDCGNVEFBq9UrXZ8XY+WGhI1pbxVp9SCUVrLckARDSS4
t5NeVeLCFlNKmw+ElU7zCKaa4Cyloq9lGFBA1ZgchGADRsOrha9VHNRVxR0pHSIG
100SW+nFhncNWqVQ2YgspVe1so993wGnORPpsb+o3dg7mIn2wkj6WhTfAKv/UQ1W
7JUFaRi/rMC8h/njLKvbX+gmEU1d4nnTyZ76UFh+VxU6mbVWYqI44DCLpt+mPT13
JwwqGGCDPnB/28KFmQITYAkdmvAV3u2aZLXJAQPxKVF7/IzklxHHz2ifMEwtPzOh
UvuWhjmmPAqncKWXzflxMj8i4C3sPyAs0RDSrMXG7jZJlhguVea+b8bXNhEafR+n
GM0btAfGw+VWluyNMlOpigSpJt/n6/hQtzlgBQGn7CeknNwamBe5MLGSN3N9MgL9
WXma9sKn34IqjxtSSP5rJlwTRWHELUZIsKmOnWP4/3gwf1+Fe65ML2cCwp6saeMX
ZjEosYxdKo32LiZhRDPR
=URwe
-----END PGP SIGNATURE-----
Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module updates from Rusty Russell:
"The only interesting thing here is Jessica's patch to add
ro_after_init support to modules. The rest are all trivia"
* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
extable.h: add stddef.h so "NULL" definition is not implicit
modules: add ro_after_init support
jump_label: disable preemption around __module_text_address().
exceptions: fork exception table content from module.h into extable.h
modules: Add kernel parameter to blacklist modules
module: Do a WARN_ON_ONCE() for assert module mutex not held
Documentation/module-signing.txt: Note need for version info if reusing a key
module: Invalidate signatures on force-loaded modules
module: Issue warnings when tainting kernel
module: fix redundant test.
module: fix noreturn attribute for __module_put_and_exit()
Includes GSI routing support to go along with the new VGIC and a small fix that
has been cooking in -next for a while.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJXoydqAAoJEEtpOizt6ddyM3oH/1A4VeG/J9q4fBPXqY2tVWXs
c3P7UgNcrEgUNs/F9ykQY/lb31deecUzaBt1OyTf+RlsNbihq3dQdYcBhxtUODw/
Faok582ya3UFgLW+IRHcID0EbkVOpIzMhOStYsnU/Dz7HG1JL9HdPzwkid7iu9LT
fI6yrrBnJFjdWAAQ4BkcEKBENRsY8NTs7jX5vnFA92MkUBby7BmariPDD3FtrB+f
Ob9B7CxM30pNqsN7OA/QvFOHMJHxf3s1TBKwmPHe5TLIfSzV1YxcEGiMc0lWqF4v
BT8ZeMGCtjDw94tND1DskfQQRPaMqPmGuRTrAW/IuE2n92bFtbqIqs7Cbw0fzLE=
=Vm6Q
-----END PGP SIGNATURE-----
Merge tag 'kvm-arm-for-4.8-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/ARM Changes for v4.8 - Take 2
Includes GSI routing support to go along with the new VGIC and a small fix that
has been cooking in -next for a while.
Add ro_after_init support for modules by adding a new page-aligned section
in the module layout (after rodata) for ro_after_init data and enabling RO
protection for that section after module init runs.
Signed-off-by: Jessica Yu <jeyu@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Merge yet more updates from Andrew Morton:
- the rest of ocfs2
- various hotfixes, mainly MM
- quite a bit of misc stuff - drivers, fork, exec, signals, etc.
- printk updates
- firmware
- checkpatch
- nilfs2
- more kexec stuff than usual
- rapidio updates
- w1 things
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (111 commits)
ipc: delete "nr_ipc_ns"
kcov: allow more fine-grained coverage instrumentation
init/Kconfig: add clarification for out-of-tree modules
config: add android config fragments
init/Kconfig: ban CONFIG_LOCALVERSION_AUTO with allmodconfig
relay: add global mode support for buffer-only channels
init: allow blacklisting of module_init functions
w1:omap_hdq: fix regression
w1: add helper macro module_w1_family
w1: remove need for ida and use PLATFORM_DEVID_AUTO
rapidio/switches: add driver for IDT gen3 switches
powerpc/fsl_rio: apply changes for RIO spec rev 3
rapidio: modify for rev.3 specification changes
rapidio: change inbound window size type to u64
rapidio/idt_gen2: fix locking warning
rapidio: fix error handling in mbox request/release functions
rapidio/tsi721_dma: advance queue processing from transfer submit call
rapidio/tsi721: add messaging mbox selector parameter
rapidio/tsi721: add PCIe MRRS override parameter
rapidio/tsi721_dma: add channel mask and queue size parameters
...
Add channelized messaging driver to support native RapidIO messaging
exchange between multiple senders/recipients on devices that use kernel
RapidIO subsystem services.
This device driver is the result of collaboration within the RapidIO.org
Software Task Group (STG) between Texas Instruments, Prodrive
Technologies, Nokia Networks, BAE and IDT. Additional input was
received from other members of RapidIO.org.
The objective was to create a character mode driver interface which
exposes messaging capabilities of RapidIO endpoint devices (mports)
directly to applications, in a manner that allows the numerous and
varied RapidIO implementations to interoperate.
This char mode device driver allows user-space applications to setup
messaging communication channels using single shared RapidIO messaging
mailbox.
By default this driver uses RapidIO MBOX_1 (MBOX_0 is reserved for use by
RIONET Ethernet emulation driver).
[weiyj.lk@gmail.com: rapidio/rio_cm: fix return value check in riocm_init()]
Link: http://lkml.kernel.org/r/1469198221-21970-1-git-send-email-alexandre.bounine@idt.com
Link: http://lkml.kernel.org/r/1468952862-18056-1-git-send-email-alexandre.bounine@idt.com
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Tested-by: Barry Wood <barry.wood@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Aurelien Jacquiot <a-jacquiot@ti.com>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Cc: Barry Wood <barry.wood@idt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The header file "include/linux/nilfs2_fs.h" is composed of parts for
ioctl and disk format, and both are intended to be shared with user
space programs.
This moves them to the uapi directory "include/uapi/linux" splitting the
file to "nilfs2_api.h" and "nilfs2_ondisk.h". The following minor
changes are accompanied by this migration:
- nilfs_direct_node struct in nilfs2/direct.h is converged to
nilfs2_ondisk.h because it's an on-disk structure.
- inline functions nilfs_rec_len_from_disk() and
nilfs_rec_len_to_disk() are moved to nilfs2/dir.c.
Link: http://lkml.kernel.org/r/1465825507-3407-4-git-send-email-konishi.ryusuke@lab.ntt.co.jp
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
VGIC implementation.
- s390: support for trapping software breakpoints, nested virtualization
(vSIE), the STHYI opcode, initial extensions for CPU model support.
- MIPS: support for MIPS64 hosts (32-bit guests only) and lots of cleanups,
preliminary to this and the upcoming support for hardware virtualization
extensions.
- x86: support for execute-only mappings in nested EPT; reduced vmexit
latency for TSC deadline timer (by about 30%) on Intel hosts; support for
more than 255 vCPUs.
- PPC: bugfixes.
The ugly bit is the conflicts. A couple of them are simple conflicts due
to 4.7 fixes, but most of them are with other trees. There was definitely
too much reliance on Acked-by here. Some conflicts are for KVM patches
where _I_ gave my Acked-by, but the worst are for this pull request's
patches that touch files outside arch/*/kvm. KVM submaintainers should
probably learn to synchronize better with arch maintainers, with the
latter providing topic branches whenever possible instead of Acked-by.
This is what we do with arch/x86. And I should learn to refuse pull
requests when linux-next sends scary signals, even if that means that
submaintainers have to rebase their branches.
Anyhow, here's the list:
- arch/x86/kvm/vmx.c: handle_pcommit and EXIT_REASON_PCOMMIT was removed
by the nvdimm tree. This tree adds handle_preemption_timer and
EXIT_REASON_PREEMPTION_TIMER at the same place. In general all mentions
of pcommit have to go.
There is also a conflict between a stable fix and this patch, where the
stable fix removed the vmx_create_pml_buffer function and its call.
- virt/kvm/kvm_main.c: kvm_cpu_notifier was removed by the hotplug tree.
This tree adds kvm_io_bus_get_dev at the same place.
- virt/kvm/arm/vgic.c: a few final bugfixes went into 4.7 before the
file was completely removed for 4.8.
- include/linux/irqchip/arm-gic-v3.h: this one is entirely our fault;
this is a change that should have gone in through the irqchip tree and
pulled by kvm-arm. I think I would have rejected this kvm-arm pull
request. The KVM version is the right one, except that it lacks
GITS_BASER_PAGES_SHIFT.
- arch/powerpc: what a mess. For the idle_book3s.S conflict, the KVM
tree is the right one; everything else is trivial. In this case I am
not quite sure what went wrong. The commit that is causing the mess
(fd7bacbca4, "KVM: PPC: Book3S HV: Fix TB corruption in guest exit
path on HMI interrupt", 2016-05-15) touches both arch/powerpc/kernel/
and arch/powerpc/kvm/. It's large, but at 396 insertions/5 deletions
I guessed that it wasn't really possible to split it and that the 5
deletions wouldn't conflict. That wasn't the case.
- arch/s390: also messy. First is hypfs_diag.c where the KVM tree
moved some code and the s390 tree patched it. You have to reapply the
relevant part of commits 6c22c98637, plus all of e030c1125e, to
arch/s390/kernel/diag.c. Or pick the linux-next conflict
resolution from http://marc.info/?l=kvm&m=146717549531603&w=2.
Second, there is a conflict in gmap.c between a stable fix and 4.8.
The KVM version here is the correct one.
I have pushed my resolution at refs/heads/merge-20160802 (commit
3d1f53419842) at git://git.kernel.org/pub/scm/virt/kvm/kvm.git.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJXoGm7AAoJEL/70l94x66DugQIAIj703ePAFepB/fCrKHkZZia
SGrsBdvAtNsOhr7FQ5qvvjLxiv/cv7CymeuJivX8H+4kuUHUllDzey+RPHYHD9X7
U6n1PdCH9F15a3IXc8tDjlDdOMNIKJixYuq1UyNZMU6NFwl00+TZf9JF8A2US65b
x/41W98ilL6nNBAsoDVmCLtPNWAqQ3lajaZELGfcqRQ9ZGKcAYOaLFXHv2YHf2XC
qIDMf+slBGSQ66UoATnYV2gAopNlWbZ7n0vO6tE2KyvhHZ1m399aBX1+k8la/0JI
69r+Tz7ZHUSFtmlmyByi5IAB87myy2WQHyAPwj+4vwJkDGPcl0TrupzbG7+T05Y=
=42ti
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Paolo Bonzini:
- ARM: GICv3 ITS emulation and various fixes. Removal of the
old VGIC implementation.
- s390: support for trapping software breakpoints, nested
virtualization (vSIE), the STHYI opcode, initial extensions
for CPU model support.
- MIPS: support for MIPS64 hosts (32-bit guests only) and lots
of cleanups, preliminary to this and the upcoming support for
hardware virtualization extensions.
- x86: support for execute-only mappings in nested EPT; reduced
vmexit latency for TSC deadline timer (by about 30%) on Intel
hosts; support for more than 255 vCPUs.
- PPC: bugfixes.
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (302 commits)
KVM: PPC: Introduce KVM_CAP_PPC_HTM
MIPS: Select HAVE_KVM for MIPS64_R{2,6}
MIPS: KVM: Reset CP0_PageMask during host TLB flush
MIPS: KVM: Fix ptr->int cast via KVM_GUEST_KSEGX()
MIPS: KVM: Sign extend MFC0/RDHWR results
MIPS: KVM: Fix 64-bit big endian dynamic translation
MIPS: KVM: Fail if ebase doesn't fit in CP0_EBase
MIPS: KVM: Use 64-bit CP0_EBase when appropriate
MIPS: KVM: Set CP0_Status.KX on MIPS64
MIPS: KVM: Make entry code MIPS64 friendly
MIPS: KVM: Use kmap instead of CKSEG0ADDR()
MIPS: KVM: Use virt_to_phys() to get commpage PFN
MIPS: Fix definition of KSEGX() for 64-bit
KVM: VMX: Add VMCS to CPU's loaded VMCSs before VMPTRLD
kvm: x86: nVMX: maintain internal copy of current VMCS
KVM: PPC: Book3S HV: Save/restore TM state in H_CEDE
KVM: PPC: Book3S HV: Pull out TM state save/restore into separate procedures
KVM: arm64: vgic-its: Simplify MAPI error handling
KVM: arm64: vgic-its: Make vgic_its_cmd_handle_mapi similar to other handlers
KVM: arm64: vgic-its: Turn device_id validation into generic ID validation
...
This patch tries to implement an device IOTLB for vhost. This could be
used with userspace(qemu) implementation of DMA remapping
to emulate an IOMMU for the guest.
The idea is simple, cache the translation in a software device IOTLB
(which is implemented as an interval tree) in vhost and use vhost_net
file descriptor for reporting IOTLB miss and IOTLB
update/invalidation. When vhost meets an IOTLB miss, the fault
address, size and access can be read from the file. After userspace
finishes the translation, it writes the translated address to the
vhost_net file to update the device IOTLB.
When device IOTLB is enabled by setting VIRTIO_F_IOMMU_PLATFORM all vq
addresses set by ioctl are treated as iova instead of virtual address and
the accessing can only be done through IOTLB instead of direct userspace
memory access. Before each round or vq processing, all vq metadata is
prefetched in device IOTLB to make sure no translation fault happens
during vq processing.
In most cases, virtqueues are contiguous even in virtual address space.
The IOTLB translation for virtqueue itself may make it a little
slower. We might add fast path cache on top of this patch.
Signed-off-by: Jason Wang <jasowang@redhat.com>
[mst: use virtio feature bit: VHOST_F_DEVICE_IOTLB -> VIRTIO_F_IOMMU_PLATFORM ]
[mst: fix build warnings ]
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
[ weiyj.lk: missing unlock on error ]
Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com>
VM sockets vhost transport implementation. This driver runs on the
host.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This module contains the common code and header files for the following
virtio_transporto and vhost_vsock kernel modules.
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The interaction between virtio and IOMMUs is messy.
On most systems with virtio, physical addresses match bus addresses,
and it doesn't particularly matter which one we use to program
the device.
On some systems, including Xen and any system with a physical device
that speaks virtio behind a physical IOMMU, we must program the IOMMU
for virtio DMA to work at all.
On other systems, including SPARC and PPC64, virtio-pci devices are
enumerated as though they are behind an IOMMU, but the virtio host
ignores the IOMMU, so we must either pretend that the IOMMU isn't
there or somehow map everything as the identity.
Add a feature bit to detect that quirk: VIRTIO_F_IOMMU_PLATFORM.
Any device with this feature bit set to 0 needs a quirk and has to be
passed physical addresses (as opposed to bus addresses) even though
the device is behind an IOMMU.
Note: it has to be a per-device quirk because for example, there could
be a mix of passed-through and virtual virtio devices. As another
example, some devices could be implemented by an out of process
hypervisor backend (in case of qemu vhost, or vhost-user) and so support
for an IOMMU needs to be coded up separately.
It would be cleanest to handle this in IOMMU core code, but that needs
per-device DMA ops. While we are waiting for that to be implemented, use
a work-around in virtio core.
Note: a "noiommu" feature is a quirk - add a wrapper to make
that clear.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce a new KVM capability, KVM_CAP_PPC_HTM, that can be queried to
determine if a PowerPC KVM guest should use HTM (Hardware Transactional
Memory).
This will be used by QEMU to populate the pa-features bits in the
guest's device tree.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch adds twelve ELF core note sections for powerpc
architecture for various registers and register sets which
need to be accessed from ptrace interface and then gdb.
These additions include special purpose registers like TAR,
PPR, DSCR, TM running and checkpointed state for various
register sets, EBB related register set, performance monitor
register set etc. Addition of these new ELF core note
sections extends the existing ELF ABI on powerpc arch without
affecting it in any manner.
Signed-off-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Pull security subsystem updates from James Morris:
"Highlights:
- TPM core and driver updates/fixes
- IPv6 security labeling (CALIPSO)
- Lots of Apparmor fixes
- Seccomp: remove 2-phase API, close hole where ptrace can change
syscall #"
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (156 commits)
apparmor: fix SECURITY_APPARMOR_HASH_DEFAULT parameter handling
tpm: Add TPM 2.0 support to the Nuvoton i2c driver (NPCT6xx family)
tpm: Factor out common startup code
tpm: use devm_add_action_or_reset
tpm2_i2c_nuvoton: add irq validity check
tpm: read burstcount from TPM_STS in one 32-bit transaction
tpm: fix byte-order for the value read by tpm2_get_tpm_pt
tpm_tis_core: convert max timeouts from msec to jiffies
apparmor: fix arg_size computation for when setprocattr is null terminated
apparmor: fix oops, validate buffer size in apparmor_setprocattr()
apparmor: do not expose kernel stack
apparmor: fix module parameters can be changed after policy is locked
apparmor: fix oops in profile_unpack() when policy_db is not present
apparmor: don't check for vmalloc_addr if kvzalloc() failed
apparmor: add missing id bounds check on dfa verification
apparmor: allow SYS_CAP_RESOURCE to be sufficient to prlimit another task
apparmor: use list_next_entry instead of list_entry_next
apparmor: fix refcount race when finding a child profile
apparmor: fix ref count leak when profile sha1 hash is read
apparmor: check that xindex is in trans_table bounds
...
1/ Replace pcommit with ADR / directed-flushing:
The pcommit instruction, which has not shipped on any product, is
deprecated. Instead, the requirement is that platforms implement either
ADR, or provide one or more flush addresses per nvdimm. ADR
(Asynchronous DRAM Refresh) flushes data in posted write buffers to the
memory controller on a power-fail event. Flush addresses are defined in
ACPI 6.x as an NVDIMM Firmware Interface Table (NFIT) sub-structure:
"Flush Hint Address Structure". A flush hint is an mmio address that
when written and fenced assures that all previous posted writes
targeting a given dimm have been flushed to media.
2/ On-demand ARS (address range scrub):
Linux uses the results of the ACPI ARS commands to track bad blocks
in pmem devices. When latent errors are detected we re-scrub the media
to refresh the bad block list, userspace can also request a re-scrub at
any time.
3/ Support for the Microsoft DSM (device specific method) command format.
4/ Support for EDK2/OVMF virtual disk device memory ranges.
5/ Various fixes and cleanups across the subsystem.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXmXBsAAoJEB7SkWpmfYgCEwwP/1IOt9ocP+iHLMDH9KE7VaTZ
NmUDR+Zy6g5cRQM7SgcuU5BXUcx+OsSrSrUTVF1cW994o9Gbz1mFotkv0ZAsPcYY
ZVRQxo2oqHrssyOcg+PsgKWiXn68rJOCgmpEyzaJywl5qTMst7pzsT1s1f7rSh6h
trCf4VaJJwxZR8fARGtlHUnnhPe2Orp99EZRKEWprAsIv2kPuWpPHSjRjuEgN1JG
KW8AYwWqFTtiLRUk86I4KBB0wcDrfctsjgN9Ogd6+aHyQBRnVSr2U+vDCFkC8KLu
qiDCpYp+yyxBjclnljz7tRRT3GtzfCUWd4v2KVWqgg2IaobUc0Lbukp/rmikUXQP
WLikT2OCQ994eFK5OX3Q3cIU/4j459TQnof8q14yVSpjAKrNUXVSR5puN7Hxa+V7
41wKrAsnsyY1oq+Yd/rMR8VfH7PHx3bFkrmRCGZCufLX1UQm4aYj+sWagDKiV3yA
DiudghbOnhfurfGsnXUVw7y7GKs+gNWNBmB6ndAD6ZEHmKoGUhAEbJDLCc3DnANl
b/2mv1MIdIcC1DlCmnbbcn6fv6bICe/r8poK3VrCK3UgOq/EOvKIWl7giP+k1JuC
6DdVYhlNYIVFXUNSLFAwz8OkLu8byx7WDm36iEqrKHtPw+8qa/2bWVgOU6OBgpjV
cN3edFVIdxvZeMgM5Ubq
=xCBG
-----END PGP SIGNATURE-----
Merge tag 'libnvdimm-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm updates from Dan Williams:
- Replace pcommit with ADR / directed-flushing.
The pcommit instruction, which has not shipped on any product, is
deprecated. Instead, the requirement is that platforms implement
either ADR, or provide one or more flush addresses per nvdimm.
ADR (Asynchronous DRAM Refresh) flushes data in posted write buffers
to the memory controller on a power-fail event.
Flush addresses are defined in ACPI 6.x as an NVDIMM Firmware
Interface Table (NFIT) sub-structure: "Flush Hint Address Structure".
A flush hint is an mmio address that when written and fenced assures
that all previous posted writes targeting a given dimm have been
flushed to media.
- On-demand ARS (address range scrub).
Linux uses the results of the ACPI ARS commands to track bad blocks
in pmem devices. When latent errors are detected we re-scrub the
media to refresh the bad block list, userspace can also request a
re-scrub at any time.
- Support for the Microsoft DSM (device specific method) command
format.
- Support for EDK2/OVMF virtual disk device memory ranges.
- Various fixes and cleanups across the subsystem.
* tag 'libnvdimm-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (41 commits)
libnvdimm-btt: Delete an unnecessary check before the function call "__nd_device_register"
nfit: do an ARS scrub on hitting a latent media error
nfit: move to nfit/ sub-directory
nfit, libnvdimm: allow an ARS scrub to be triggered on demand
libnvdimm: register nvdimm_bus devices with an nd_bus driver
pmem: clarify a debug print in pmem_clear_poison
x86/insn: remove pcommit
Revert "KVM: x86: add pcommit support"
nfit, tools/testing/nvdimm/: unify shutdown paths
libnvdimm: move ->module to struct nvdimm_bus_descriptor
nfit: cleanup acpi_nfit_init calling convention
nfit: fix _FIT evaluation memory leak + use after free
tools/testing/nvdimm: add manufacturing_{date|location} dimm properties
tools/testing/nvdimm: add virtual ramdisk range
acpi, nfit: treat virtual ramdisk SPA as pmem region
pmem: kill __pmem address space
pmem: kill wmb_pmem()
libnvdimm, pmem: use nvdimm_flush() for namespace I/O writes
fs/dax: remove wmb_pmem()
libnvdimm, pmem: flush posted-write queues on shutdown
...
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXmLuzAAoJEAhfPr2O5OEVXyUP/jzXfbc5e72ECimqZQnJeK3a
+MrJUrvp9W/7e99WUHwb2QoDYo/SDpXdbLbRjU0BTnPy47XS8J10MxsfE1eRnfds
+TAhw96Hs3Wm3A5OxQfqPsZmq3AtGpU+HJ/zzulpY0bTqEzAtD8oUO9ZjT39Eshz
xFbeIr6Kco6N6W4Ja998g4uBmvjN/24xOjSmuETja+gK2zztFPp70opefA2+crdY
ChOgdUOj6cPDZlddExGf9oVrEMJhM9tn2ACrbFsD0sWPvix145JbVXgaOdGrt2Jc
huRREv9tKte2/Os7IQ8wN33Jzca6RtdXg3twlauGIOjhYuaeVoaSCqQrRgKNjFl+
CWTucHGnCP82bfmFvQxCWkhGKGGMzcNdQlnjKM7PrGPrSwW9+FNpLzs5XDX5S0gj
UYkS/smbwzOOrfEcttFCMs5EGoAs+zOYzHtyCKp4EyWcznN5wRR8y3L3cdhmGAiG
jAqTh/orjSOnnM2qQXqt574IiGTVvJqGEZyLAWC16HKbRaEFzL/+9Z79lkoxQfPn
iLALAtTagFjxj/5ITxr5b2TzgWZ+GNFnyEpeC4DXqdESVfuZX00wiy0YEq1JFG+J
RIj3VQpDqUwIs5KMqGQDG/S58iCfR83AIJBpSFIWaCHByvZ3thL8H4og20oZ8Akt
TQXCqmzIvzn8QIZS/Pge
=CN+v
-----END PGP SIGNATURE-----
Merge tag 'media/v4.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media documentation updates from Mauro Carvalho Chehab:
"This patch series does the conversion of all media documentation stuff
to Restrutured Text markup format and add them to the
Documentation/index.rst file.
The media documentation was grouped into 4 books:
- media uAPI
- media kAPI
- V4L driver-specific documentation
- DVB driver-specific documentation
It also contains several documentation improvements and one fixup
patch for a core issue with cropcap.
PS. After this patch series, the media DocBook is deprecated and
should be removed. I'll add such patch on a future pull request"
* tag 'media/v4.8-4' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (322 commits)
[media] cx23885-cardlist.rst: add a new card
[media] doc-rst: add some needed escape codes
[media] doc-rst: kapi: use :c:func: instead of :cpp:func
doc-rst: kernel-doc: fix a change introduced by mistake
[media] v4l2-ioctl.h add debug info for struct v4l2_ioctl_ops
[media] dvb_ringbuffer.h: some documentation improvements
[media] v4l2-ctrls.h: fully document the header file
[media] doc-rst: Fix some typedef ugly warnings
[media] doc-rst: reorganize the kAPI v4l2 chapters
[media] rename v4l2-framework.rst to v4l2-intro.rst
[media] move V4L2 clocks to a separate .rst file
[media] v4l2-fh.rst: add cross references and markups
[media] v4l2-fh.rst: add fh contents from v4l2-framework.rst
[media] v4l2-fh.h: add documentation for it
[media] v4l2-event.rst: add cross-references and markups
[media] v4l2-event.h: document all functions
[media] v4l2-event.rst: add text from v4l2-framework.rst
[media] v4l2-framework.rst: remove videobuf quick chapter
[media] v4l2-dev: add cross-references and improve markup
[media] doc-rst: move v4l2-dev doc to a separate file
...
Pull i2c updates from Wolfram Sang:
"Here is the I2C pull request for 4.8:
- the core and i801 driver gained support for SMBus Host Notify
- core support for more than one address in DT
- i2c_add_adapter() has now better error messages. We can remove all
error messages from drivers calling it as a next step.
- bigger updates to rk3x driver to support rk3399 SoC
- the at24 eeprom driver got refactored and can now read special
variants with unique serials or fixed MAC addresses.
The rest is regular driver updates and bugfixes"
* 'i2c/for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (66 commits)
i2c: i801: use IS_ENABLED() instead of checking for built-in or module
Documentation: i2c: slave: give proper example for pm usage
Documentation: i2c: slave: describe buffer problems a bit better
i2c: bcm2835: Don't complain on -EPROBE_DEFER from getting our clock
i2c: i2c-smbus: drop useless stubs
i2c: efm32: fix a failure path in efm32_i2c_probe()
Revert "i2c: core: Cleanup I2C ACPI namespace"
Revert "i2c: core: Add function for finding the bus speed from ACPI"
i2c: Update the description of I2C_SMBUS
i2c: i2c-smbus: fix i2c_handle_smbus_host_notify documentation
eeprom: at24: tweak the loop_until_timeout() macro
eeprom: at24: add support for at24mac series
eeprom: at24: support reading the serial number for 24csxx
eeprom: at24: platform_data: use BIT() macro
eeprom: at24: split at24_eeprom_write() into specialized functions
eeprom: at24: split at24_eeprom_read() into specialized functions
eeprom: at24: hide the read/write loop behind a macro
eeprom: at24: call read/write functions via function pointers
eeprom: at24: coding style fixes
eeprom: at24: move at24_read() below at24_eeprom_write()
...
Pull networking updates from David Miller:
1) Unified UDP encapsulation offload methods for drivers, from
Alexander Duyck.
2) Make DSA binding more sane, from Andrew Lunn.
3) Support QCA9888 chips in ath10k, from Anilkumar Kolli.
4) Several workqueue usage cleanups, from Bhaktipriya Shridhar.
5) Add XDP (eXpress Data Path), essentially running BPF programs on RX
packets as soon as the device sees them, with the option to mirror
the packet on TX via the same interface. From Brenden Blanco and
others.
6) Allow qdisc/class stats dumps to run lockless, from Eric Dumazet.
7) Add VLAN support to b53 and bcm_sf2, from Florian Fainelli.
8) Simplify netlink conntrack entry layout, from Florian Westphal.
9) Add ipv4 forwarding support to mlxsw spectrum driver, from Ido
Schimmel, Yotam Gigi, and Jiri Pirko.
10) Add SKB array infrastructure and convert tun and macvtap over to it.
From Michael S Tsirkin and Jason Wang.
11) Support qdisc packet injection in pktgen, from John Fastabend.
12) Add neighbour monitoring framework to TIPC, from Jon Paul Maloy.
13) Add NV congestion control support to TCP, from Lawrence Brakmo.
14) Add GSO support to SCTP, from Marcelo Ricardo Leitner.
15) Allow GRO and RPS to function on macsec devices, from Paolo Abeni.
16) Support MPLS over IPV4, from Simon Horman.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1622 commits)
xgene: Fix build warning with ACPI disabled.
be2net: perform temperature query in adapter regardless of its interface state
l2tp: Correctly return -EBADF from pppol2tp_getname.
net/mlx5_core/health: Remove deprecated create_singlethread_workqueue
net: ipmr/ip6mr: update lastuse on entry change
macsec: ensure rx_sa is set when validation is disabled
tipc: dump monitor attributes
tipc: add a function to get the bearer name
tipc: get monitor threshold for the cluster
tipc: make cluster size threshold for monitoring configurable
tipc: introduce constants for tipc address validation
net: neigh: disallow transition to NUD_STALE if lladdr is unchanged in neigh_update()
MAINTAINERS: xgene: Add driver and documentation path
Documentation: dtb: xgene: Add MDIO node
dtb: xgene: Add MDIO node
drivers: net: xgene: ethtool: Use phy_ethtool_gset and sset
drivers: net: xgene: Use exported functions
drivers: net: xgene: Enable MDIO driver
drivers: net: xgene: Add backward compatibility
drivers: net: phy: xgene: Add MDIO driver
...
- Kexec support for arm64
- Kprobes support
- Expose MIDR_EL1 and REVIDR_EL1 CPU identification registers to sysfs
- Trapping of user space cache maintenance operations and emulation in
the kernel (CPU errata workaround)
- Clean-up of the early page tables creation (kernel linear mapping, EFI
run-time maps) to avoid splitting larger blocks (e.g. pmds) into
smaller ones (e.g. ptes)
- VDSO support for CLOCK_MONOTONIC_RAW in clock_gettime()
- ARCH_HAS_KCOV enabled for arm64
- Optimise IP checksum helpers
- SWIOTLB optimisation to only allocate/initialise the buffer if the
available RAM is beyond the 32-bit mask
- Properly handle the "nosmp" command line argument
- Fix for the initialisation of the CPU debug state during early boot
- vdso-offsets.h build dependency workaround
- Build fix when RANDOMIZE_BASE is enabled with MODULES off
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXmF/UAAoJEGvWsS0AyF7x+jwP/2fErtX6FTXmdG0c3HBkTpuy
gEuzN2ByWbP6Io+unLC6NvbQQb1q6c73PTqjsoeMHUx2o8YK3jgWEBcC+7AuepoZ
YGl3r08e75a/fGrgNwEQQC1lNlgjpog4kzVDh5ji6oRXNq+OkjJGUtRPe3gBoqxv
NAjviciID/MegQaq4SaMd26AmnjuUGKogo5vlIaXK0SemX9it+ytW7eLAXuVY+gW
EvO3Nxk0Y5oZKJF8qRw6oLSmw1bwn2dD26OgfXfCiI30QBookRyWIoXRedUOZmJq
D0+Tipd7muO4PbjlxS8aY/wd/alfnM5+TJ6HpGDo+Y1BDauXfiXMf3ktDFE5QvJB
KgtICmC0stWwbDT35dHvz8sETsrCMA2Q/IMrnyxG+nj9BxVQU7rbNrxfCXesJy7Q
4EsQbcTyJwu+ECildBezfoei99XbFZyWk2vKSkTCFKzgwXpftGFaffgZ3DIzBAHH
IjecDqIFENC8ymrjyAgrGjeFG+2WB/DBgoSS3Baiz6xwQqC4wFMnI3jPECtJjb/U
6e13f+onXu5lF1YFKAiRjGmqa/G1ZMr+uKZFsembuGqsZdAPkzzUHyAE9g4JVO8p
t3gc3/M3T7oLSHuw4xi1/Ow5VGb2UvbslFrp7OpuFZ7CJAvhKlHL5rPe385utsFE
7++5WHXHAegeJCDNAKY2
=iJOY
-----END PGP SIGNATURE-----
Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 updates from Catalin Marinas:
- Kexec support for arm64
- Kprobes support
- Expose MIDR_EL1 and REVIDR_EL1 CPU identification registers to sysfs
- Trapping of user space cache maintenance operations and emulation in
the kernel (CPU errata workaround)
- Clean-up of the early page tables creation (kernel linear mapping,
EFI run-time maps) to avoid splitting larger blocks (e.g. pmds) into
smaller ones (e.g. ptes)
- VDSO support for CLOCK_MONOTONIC_RAW in clock_gettime()
- ARCH_HAS_KCOV enabled for arm64
- Optimise IP checksum helpers
- SWIOTLB optimisation to only allocate/initialise the buffer if the
available RAM is beyond the 32-bit mask
- Properly handle the "nosmp" command line argument
- Fix for the initialisation of the CPU debug state during early boot
- vdso-offsets.h build dependency workaround
- Build fix when RANDOMIZE_BASE is enabled with MODULES off
* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (64 commits)
arm64: arm: Fix-up the removal of the arm64 regs_query_register_name() prototype
arm64: Only select ARM64_MODULE_PLTS if MODULES=y
arm64: mm: run pgtable_page_ctor() on non-swapper translation table pages
arm64: mm: make create_mapping_late() non-allocating
arm64: Honor nosmp kernel command line option
arm64: Fix incorrect per-cpu usage for boot CPU
arm64: kprobes: Add KASAN instrumentation around stack accesses
arm64: kprobes: Cleanup jprobe_return
arm64: kprobes: Fix overflow when saving stack
arm64: kprobes: WARN if attempting to step with PSTATE.D=1
arm64: debug: remove unused local_dbg_{enable, disable} macros
arm64: debug: remove redundant spsr manipulation
arm64: debug: unmask PSTATE.D earlier
arm64: localise Image objcopy flags
arm64: ptrace: remove extra define for CPSR's E bit
kprobes: Add arm64 case in kprobe example module
arm64: Add kernel return probes support (kretprobes)
arm64: Add trampoline code for kretprobes
arm64: kprobes instruction simulation support
arm64: Treat all entry code as non-kprobe-able
...
* topic/docs-next: (322 commits)
[media] cx23885-cardlist.rst: add a new card
[media] doc-rst: add some needed escape codes
[media] doc-rst: kapi: use :c:func: instead of :cpp:func
doc-rst: kernel-doc: fix a change introduced by mistake
[media] v4l2-ioctl.h add debug info for struct v4l2_ioctl_ops
[media] dvb_ringbuffer.h: some documentation improvements
[media] v4l2-ctrls.h: fully document the header file
[media] doc-rst: Fix some typedef ugly warnings
[media] doc-rst: reorganize the kAPI v4l2 chapters
[media] rename v4l2-framework.rst to v4l2-intro.rst
[media] move V4L2 clocks to a separate .rst file
[media] v4l2-fh.rst: add cross references and markups
[media] v4l2-fh.rst: add fh contents from v4l2-framework.rst
[media] v4l2-fh.h: add documentation for it
[media] v4l2-event.rst: add cross-references and markups
[media] v4l2-event.h: document all functions
[media] v4l2-event.rst: add text from v4l2-framework.rst
[media] v4l2-framework.rst: remove videobuf quick chapter
[media] v4l2-dev: add cross-references and improve markup
[media] doc-rst: move v4l2-dev doc to a separate file
...
Merge updates from Andrew Morton:
- a few misc bits
- ocfs2
- most(?) of MM
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (125 commits)
thp: fix comments of __pmd_trans_huge_lock()
cgroup: remove unnecessary 0 check from css_from_id()
cgroup: fix idr leak for the first cgroup root
mm: memcontrol: fix documentation for compound parameter
mm: memcontrol: remove BUG_ON in uncharge_list
mm: fix build warnings in <linux/compaction.h>
mm, thp: convert from optimistic swapin collapsing to conservative
mm, thp: fix comment inconsistency for swapin readahead functions
thp: update Documentation/{vm/transhuge,filesystems/proc}.txt
shmem: split huge pages beyond i_size under memory pressure
thp: introduce CONFIG_TRANSPARENT_HUGE_PAGECACHE
khugepaged: add support of collapse for tmpfs/shmem pages
shmem: make shmem_inode_info::lock irq-safe
khugepaged: move up_read(mmap_sem) out of khugepaged_alloc_page()
thp: extract khugepaged from mm/huge_memory.c
shmem, thp: respect MADV_{NO,}HUGEPAGE for file mappings
shmem: add huge pages support
shmem: get_unmapped_area align huge page
shmem: prepare huge= mount option and sysfs knob
mm, rmap: account shmem thp pages
...
Core changes:
- The big item is of course the completion of the character
device ABI. It has now replaced and surpassed the former
unmaintainable sysfs ABI: we can now hammer (bitbang)
individual lines or sets of lines and read individual lines
or sets of lines from userspace, and we can also register
to listen to GPIO events from userspace. As a tie-in we
have two new tools in tools/gpio: gpio-hammer and
gpio-event-mon that illustrate the proper use of the new
ABI. As someone said: the wild west days of GPIO are now
over.
- Continued to remove the pointless
ARCH_[WANT_OPTIONAL|REQUIRE]_GPIOLIB Kconfig symbols.
I'm patching hexagon, openrisc, powerpc, sh, unicore,
ia64 and microblaze. These are either ACKed by their
maintainers or patched anyways after a grace period and
no response from maintainers. Some archs (ARM) come in from
their trees, and others (x86) are still not fixed, so I
might send a second pull request to root it out later in
this merge window, or just defer to v4.9.
- The GPIO tools are moved to the tools build system.
New drivers:
- New driver for the MAX77620/MAX20024.
- New driver for the Intel Merrifield.
- Enabled PCA953x for the TI PCA9536.
- Enabled PCA953x for the Intel Edison.
- Enabled R8A7792 in the RCAR driver.
Driver improvements:
- The STMPE and F7188x now supports the .get_direction()
callback.
- The Xilinx driver supports setting multiple lines at
once.
- ACPI support for the Vulcan GPIO controller.
- The MMIO GPIO driver supports device tree probing.
- The Acer One 10 is supported through the _DEP ACPI
attribute.
Cleanups:
- A major cleanup of the OF/DT support code. It is way
easier to read and understand now, probably this improves
performance too.
- Drop a few redundant .owner assignments.
- Remove CLPS711x boardfile support: we are 100% DT.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXlcT4AAoJEEEQszewGV1zACwQAK5SZr0F5c3QvYbJSiJBCGA7
MZKUYHnYoBpZaPKcFKoOXEM1WOvlABlh9U0y0xkL8gQ6giyKup1wYJJCuYgW29gL
ny4r7Z8rs2Wm1ujL+FLAwuxIwCY3BnhUucp8YiSaHPBuKRfsHorFPvXiAgLZjNYC
Qk3Q48xYW4inw9sy2BbMfsU3CZnkvgy5euooyy1ezwachRhuHdBy/MVCG012PC4s
0d6LGdByEx1uK4NeV7ssPys444M8unep2EWgy6Rvc1U+FmGA487EvL+X8nxTQTj3
uTMxA8nddmZTEeEIqhpRw/dPiFlWxPFwfWmNEre05gKLb/LUK2tgsUOnmIFgVUw/
t41IzdQNLQQZxmiXplZn6s5mAr2VNuTxkRq1CIl4SwQW+Uy4TU3q8aDPkKzsyhiR
yw6o6ul0pQs8UZEggnht8ie6JiSnJ55ehI/nlRxpK/797Ff6Yp4FARs3ZtFnQDDu
SWewnbRatZQ89lvy4BA7QCWeV4Scjk4k/e2HjUAFnkfMDaYqpi4vTdzwnWdVjd+F
hMgu6VnkN3oSE7ZMrKJMh7b7h1uMnIwKBFWbkrlOEuhT1X0ZDsEOBv5juSBPYomN
EOIJUyWqxn0ZfxeONbdbCPteYlfJF+TW/rE9LQMxS1nNwsqw2IQW6NCmrM9Nx6Fv
FP++26nYMTSh82gwOYw3
=NwcK
-----END PGP SIGNATURE-----
Merge tag 'gpio-v4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio
Pull GPIO updates from Linus Walleij:
"This is the bulk of GPIO changes for the v4.8 kernel cycle. The big
news is the completion of the chardev ABI which I'm very happy about
and apart from that it's an ordinary, quite busy cycle. The details
are below.
The patches are tested in linux-next for some time, patches to other
subsystem mostly have ACKs.
I got overly ambitious with configureing lines as input for IRQ lines
but it turns out that some controllers have their interrupt-enable and
input-enabling in orthogonal settings so the assumption that all IRQ
lines are input lines does not hold. Oh well, revert and back to the
drawing board with that.
Core changes:
- The big item is of course the completion of the character device
ABI. It has now replaced and surpassed the former unmaintainable
sysfs ABI: we can now hammer (bitbang) individual lines or sets of
lines and read individual lines or sets of lines from userspace,
and we can also register to listen to GPIO events from userspace.
As a tie-in we have two new tools in tools/gpio: gpio-hammer and
gpio-event-mon that illustrate the proper use of the new ABI. As
someone said: the wild west days of GPIO are now over.
- Continued to remove the pointless ARCH_[WANT_OPTIONAL|REQUIRE]_GPIOLIB
Kconfig symbols. I'm patching hexagon, openrisc, powerpc, sh,
unicore, ia64 and microblaze. These are either ACKed by their
maintainers or patched anyways after a grace period and no response
from maintainers.
Some archs (ARM) come in from their trees, and others (x86) are
still not fixed, so I might send a second pull request to root it
out later in this merge window, or just defer to v4.9.
- The GPIO tools are moved to the tools build system.
New drivers:
- New driver for the MAX77620/MAX20024.
- New driver for the Intel Merrifield.
- Enabled PCA953x for the TI PCA9536.
- Enabled PCA953x for the Intel Edison.
- Enabled R8A7792 in the RCAR driver.
Driver improvements:
- The STMPE and F7188x now supports the .get_direction() callback.
- The Xilinx driver supports setting multiple lines at once.
- ACPI support for the Vulcan GPIO controller.
- The MMIO GPIO driver supports device tree probing.
- The Acer One 10 is supported through the _DEP ACPI attribute.
Cleanups:
- A major cleanup of the OF/DT support code. It is way easier to
read and understand now, probably this improves performance too.
- Drop a few redundant .owner assignments.
- Remove CLPS711x boardfile support: we are 100% DT"
* tag 'gpio-v4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (67 commits)
MAINTAINERS: Add INTEL MERRIFIELD GPIO entry
gpio: dwapb: add missing fwnode_handle_put() in dwapb_gpio_get_pdata()
gpio: merrifield: Protect irq_ack() and gpio_set() by lock
gpio: merrifield: Introduce GPIO driver to support Merrifield
gpio: intel-mid: Make it depend to X86_INTEL_MID
gpio: intel-mid: Sort header block alphabetically
gpio: intel-mid: Remove potentially harmful code
gpio: rcar: add R8A7792 support
gpiolib: remove duplicated include from gpiolib.c
Revert "gpio: convince line to become input in irq helper"
gpiolib: of_find_gpio(): Don't discard errors
gpio: of: Allow overriding the device node
gpio: free handles in fringe cases
gpio: tps65218: Add platform_device_id table
gpio: max77620: get gpio value based on direction
gpio: lynxpoint: avoid potential warning on error path
tools/gpio: add install section
tools/gpio: move to tools buildsystem
gpio: intel-mid: switch to devm_gpiochip_add_data()
gpio: 74x164: Use spi_write() helper instead of open coding
...
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXlfJvAAoJEAhfPr2O5OEVtLUP/RpCQ+W3YVryIdmLkdmYXoY7
m2rXtUh7GmzBjaBkFzbRCGZtgROF7zl0e1R3nm4tLbCV4Becw8HO7YiMjqFJm9xr
b6IngIyshsHf60Eii3RpLqUFvYrc/DDIMeYf8miwj/PvFAfI2BV9apraexJlpUuI
wdyi28cfBHq4WYhubaXKoAyBQ8YRA/t8KNRAkDlifaOaMbSAxWHlmqoSmJWeQx73
KHkSvbRPu4Hjo3R6q/ab8VhqmXeSnbqnQB9lgnxz7AmAZGhOlMYeAhV/K2ZwbBH8
swv36RmJVO59Ov+vNR4p7GGGDL3+qk8JLj4LNVVfOcW0A+t7WrPQEmrL6VsyaZAy
/+r4NEOcQN6Z5nFwbr3E0tYJ2Y5jFHOvsBfKd3EEGwty+hCl634akgb0vqtg06cg
E2KG+XW983RBadVwEBnEudxJb0fWPWHGhXEqRrwOD+718FNmTqYM6dEvTEyxRup8
EtCLj+eQQ4LmAyZxWyE8A+keKoMFQlHqk9LN9vQ7t7Wxq9mQ+V2l12T/lN4VhdTq
4QZ4mrCMCGEvNcNzgSg6R/9lVb6RHDtMXZ3htbB/w+5xET/IKIANYyg1Hr7ahtdh
rTW/4q6n3jtsu6tp5poteFvPzZKAblbrj2EptVzZYkonQ5BeAUisFTtneUL10Jmj
EUf/sH0fqoOA0VvV6Tu+
=mrOW
-----END PGP SIGNATURE-----
Merge tag 'media/v4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media updates from Mauro Carvalho Chehab:
- new framework support for HDMI CEC and remote control support
- new encoding codec driver for Mediatek SoC
- new frontend driver: helene tuner
- added support for NetUp almost universal devices, with supports
DVB-C/S/S2/T/T2 and ISDB-T
- the mn88472 frontend driver got promoted from staging
- a new driver for RCar video input
- some soc_camera legacy drivers got removed: timb, omap1, mx2, mx3
- lots of driver cleanups, improvements and fixups
* tag 'media/v4.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (377 commits)
[media] cec: always check all_device_types and features
[media] cec: poll should check if there is room in the tx queue
[media] vivid: support monitor all mode
[media] cec: fix test for unconfigured adapter in main message loop
[media] cec: limit the size of the transmit queue
[media] cec: zero unused msg part after msg->len
[media] cec: don't set fh to NULL in CEC_TRANSMIT
[media] cec: clear all status fields before transmit and always fill in sequence
[media] cec: CEC_RECEIVE overwrote the timeout field
[media] cxd2841er: Reading SNR for DVB-C added
[media] cxd2841er: Reading BER and UCB for DVB-C added
[media] cxd2841er: fix switch-case for DVB-C
[media] cxd2841er: fix signal strength scale for ISDB-T
[media] cxd2841er: adjust the dB scale for DVB-C
[media] cxd2841er: provide signal strength for DVB-C
[media] cxd2841er: fix BER report via DVBv5 stats API
[media] mb86a20s: apply mask to val after checking for read failure
[media] airspy: fix error logic during device register
[media] s5p-cec/TODO: add TODO item
[media] cec/TODO: drop comment about sphinx documentation
...
later merged with 'for-4.8/core' to pickup the QUEUE_FLAG_DAX commits
that DM depends on to provide its DAX support
- clean up the bio-based vs request-based DM core code by moving the
request-based DM core code out to dm-rq.[hc]
- reinstate bio-based support in the DM multipath target (done with the
idea that fast storage like NVMe over Fabrics could benefit) -- while
preserving support for request_fn and blk-mq request-based DM mpath
- SCSI and DM multipath persistent reservation fixes that were
coordinated with Martin Petersen.
- the DM raid target saw the most extensive change this cycle; it now
provides reshape and takeover support (by layering ontop of the
corresponding MD capabilities)
- DAX support for DM core and the linear, stripe and error targets
- A DM thin-provisioning block discard vs allocation race fix that
addresses potential for corruption
- A stable fix for DM verity-fec's block calculation during decode
- A few cleanups and fixes to DM core and various targets
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJXkRZmAAoJEMUj8QotnQNat2wH/i4LpkoGI5tI6UhyKWxRkzJp
vKaJ0zuZ2Ez73DucJujNuvaiyHq1IjHD5pfr8JQO3E8ygDkRC2KjF2O8EXp0Has6
U1uLahQej72MAs0ZJTpvfE+JiY6qyIl4K+xxuPmYm2f2S5TWTIgOetYjJQmcMlQo
Y8zFfcDYn4Dv5rMdvDT4+1ePETxq74wcBwTxyW3OAbHE1f0JjsUGdMKzXB1iTWcM
VjLjWI//ETfFdIlDO0w2Qbd90aLUjmTR2k67RGnbPj5kNUNikv/X6iiY32KERR/0
vMiiJ7JS+a44P7FJqCMoAVM/oBYFiSNpS4LYevOgHb0G0ikF8kaSeqBPC6sMYvg=
=uYt9
-----END PGP SIGNATURE-----
Merge tag 'dm-4.8-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper updates from Mike Snitzer:
- initially based on Jens' 'for-4.8/core' (given all the flag churn)
and later merged with 'for-4.8/core' to pickup the QUEUE_FLAG_DAX
commits that DM depends on to provide its DAX support
- clean up the bio-based vs request-based DM core code by moving the
request-based DM core code out to dm-rq.[hc]
- reinstate bio-based support in the DM multipath target (done with the
idea that fast storage like NVMe over Fabrics could benefit) -- while
preserving support for request_fn and blk-mq request-based DM mpath
- SCSI and DM multipath persistent reservation fixes that were
coordinated with Martin Petersen.
- the DM raid target saw the most extensive change this cycle; it now
provides reshape and takeover support (by layering ontop of the
corresponding MD capabilities)
- DAX support for DM core and the linear, stripe and error targets
- a DM thin-provisioning block discard vs allocation race fix that
addresses potential for corruption
- a stable fix for DM verity-fec's block calculation during decode
- a few cleanups and fixes to DM core and various targets
* tag 'dm-4.8-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (73 commits)
dm: allow bio-based table to be upgraded to bio-based with DAX support
dm snap: add fake origin_direct_access
dm stripe: add DAX support
dm error: add DAX support
dm linear: add DAX support
dm: add infrastructure for DAX support
dm thin: fix a race condition between discarding and provisioning a block
dm btree: fix a bug in dm_btree_find_next_single()
dm raid: fix random optimal_io_size for raid0
dm raid: address checkpatch.pl complaints
dm: call PR reserve/unreserve on each underlying device
sd: don't use the ALL_TG_PT bit for reservations
dm: fix second blk_delay_queue() parameter to be in msec units not jiffies
dm raid: change logical functions to actually return bool
dm raid: use rdev_for_each in status
dm raid: use rs->raid_disks to avoid memory leaks on free
dm raid: support delta_disks for raid1, fix table output
dm raid: enhance reshape check and factor out reshape setup
dm raid: allow resize during recovery
dm raid: fix rs_is_recovering() to allow for lvextend
...
This patch introduces run-time migration feature for zspage.
For migration, VM uses page.lru field so it would be better to not use
page.next field which is unified with page.lru for own purpose. For
that, firstly, we can get first object offset of the page via runtime
calculation instead of using page.index so we can use page.index as link
for page chaining instead of page.next.
In case of huge object, it stores handle to page.index instead of next
link of page chaining because huge object doesn't need to next link for
page chaining. So get_next_page need to identify huge object to return
NULL. For it, this patch uses PG_owner_priv_1 flag of the page flag.
For migration, it supports three functions
* zs_page_isolate
It isolates a zspage which includes a subpage VM want to migrate from
class so anyone cannot allocate new object from the zspage.
We could try to isolate a zspage by the number of subpage so subsequent
isolation trial of other subpage of the zpsage shouldn't fail. For
that, we introduce zspage.isolated count. With that, zs_page_isolate
can know whether zspage is already isolated or not for migration so if
it is isolated for migration, subsequent isolation trial can be
successful without trying further isolation.
* zs_page_migrate
First of all, it holds write-side zspage->lock to prevent migrate other
subpage in zspage. Then, lock all objects in the page VM want to
migrate. The reason we should lock all objects in the page is due to
race between zs_map_object and zs_page_migrate.
zs_map_object zs_page_migrate
pin_tag(handle)
obj = handle_to_obj(handle)
obj_to_location(obj, &page, &obj_idx);
write_lock(&zspage->lock)
if (!trypin_tag(handle))
goto unpin_object
zspage = get_zspage(page);
read_lock(&zspage->lock);
If zs_page_migrate doesn't do trypin_tag, zs_map_object's page can be
stale by migration so it goes crash.
If it locks all of objects successfully, it copies content from old page
to new one, finally, create new zspage chain with new page. And if it's
last isolated subpage in the zspage, put the zspage back to class.
* zs_page_putback
It returns isolated zspage to right fullness_group list if it fails to
migrate a page. If it find a zspage is ZS_EMPTY, it queues zspage
freeing to workqueue. See below about async zspage freeing.
This patch introduces asynchronous zspage free. The reason to need it
is we need page_lock to clear PG_movable but unfortunately, zs_free path
should be atomic so the apporach is try to grab page_lock. If it got
page_lock of all of pages successfully, it can free zspage immediately.
Otherwise, it queues free request and free zspage via workqueue in
process context.
If zs_free finds the zspage is isolated when it try to free zspage, it
delays the freeing until zs_page_putback finds it so it will free free
the zspage finally.
In this patch, we expand fullness_list from ZS_EMPTY to ZS_FULL. First
of all, it will use ZS_EMPTY list for delay freeing. And with adding
ZS_FULL list, it makes to identify whether zspage is isolated or not via
list_empty(&zspage->list) test.
[minchan@kernel.org: zsmalloc: keep first object offset in struct page]
Link: http://lkml.kernel.org/r/1465788015-23195-1-git-send-email-minchan@kernel.org
[minchan@kernel.org: zsmalloc: zspage sanity check]
Link: http://lkml.kernel.org/r/20160603010129.GC3304@bbox
Link: http://lkml.kernel.org/r/1464736881-24886-12-git-send-email-minchan@kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Now, VM has a feature to migrate non-lru movable pages so balloon
doesn't need custom migration hooks in migrate.c and compaction.c.
Instead, this patch implements the page->mapping->a_ops->
{isolate|migrate|putback} functions.
With that, we could remove hooks for ballooning in general migration
functions and make balloon compaction simple.
[akpm@linux-foundation.org: compaction.h requires that the includer first include node.h]
Link: http://lkml.kernel.org/r/1464736881-24886-4-git-send-email-minchan@kernel.org
Signed-off-by: Gioh Kim <gi-oh.kim@profitbricks.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
In this commit, we dump the monitor attributes when queried.
The link monitor attributes are separated into two kinds:
1. general attributes per bearer
2. specific attributes per node/peer
This style resembles the socket attributes and the nametable
publications per socket.
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In this commit, we add support to fetch the configured
cluster monitoring threshold.
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In this commit, we introduce support to configure the minimum
threshold to activate the new link monitoring algorithm.
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In this commit, we introduce defines for tipc address size,
offset and mask specification for Zone.Cluster.Node.
There is no functional change in this commit.
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull crypto updates from Herbert Xu:
"Here is the crypto update for 4.8:
API:
- first part of skcipher low-level conversions
- add KPP (Key-agreement Protocol Primitives) interface.
Algorithms:
- fix IPsec/cryptd reordering issues that affects aesni
- RSA no longer does explicit leading zero removal
- add SHA3
- add DH
- add ECDH
- improve DRBG performance by not doing CTR by hand
Drivers:
- add x86 AVX2 multibuffer SHA256/512
- add POWER8 optimised crc32c
- add xts support to vmx
- add DH support to qat
- add RSA support to caam
- add Layerscape support to caam
- add SEC1 AEAD support to talitos
- improve performance by chaining requests in marvell/cesa
- add support for Araneus Alea I USB RNG
- add support for Broadcom BCM5301 RNG
- add support for Amlogic Meson RNG
- add support Broadcom NSP SoC RNG"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (180 commits)
crypto: vmx - Fix aes_p8_xts_decrypt build failure
crypto: vmx - Ignore generated files
crypto: vmx - Adding support for XTS
crypto: vmx - Adding asm subroutines for XTS
crypto: skcipher - add comment for skcipher_alg->base
crypto: testmgr - Print akcipher algorithm name
crypto: marvell - Fix wrong flag used for GFP in mv_cesa_dma_add_iv_op
crypto: nx - off by one bug in nx_of_update_msc()
crypto: rsa-pkcs1pad - fix rsa-pkcs1pad request struct
crypto: scatterwalk - Inline start/map/done
crypto: scatterwalk - Remove unnecessary BUG in scatterwalk_start
crypto: scatterwalk - Remove unnecessary advance in scatterwalk_pagedone
crypto: scatterwalk - Fix test in scatterwalk_done
crypto: api - Optimise away crypto_yield when hard preemption is on
crypto: scatterwalk - add no-copy support to copychunks
crypto: scatterwalk - Remove scatterwalk_bytes_sglen
crypto: omap - Stop using crypto scatterwalk_bytes_sglen
crypto: skcipher - Remove top-level givcipher interface
crypto: user - Remove crypto_lookup_skcipher call
crypto: cts - Convert to skcipher
...
BTRFS_IOC_LOGICAL_INO takes a btrfs_ioctl_logical_ino_args as argument,
not a btrfs_ioctl_ino_path_args. The lines were probably copy/pasted
when the code was written.
Since btrfs_ioctl_logical_ino_args and btrfs_ioctl_ino_path_args have
the same size, the actual IOCTL definition here does not change.
But, it makes the code less confusing for the reader.
Signed-off-by: Hans van Kranenburg <hans.van.kranenburg@mendix.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This allows user memory to be written to during the course of a kprobe.
It shouldn't be used to implement any kind of security mechanism
because of TOC-TOU attacks, but rather to debug, divert, and
manipulate execution of semi-cooperative processes.
Although it uses probe_kernel_write, we limit the address space
the probe can write into by checking the space with access_ok.
We do this as opposed to calling copy_to_user directly, in order
to avoid sleeping. In addition we ensure the threads's current fs
/ segment is USER_DS and the thread isn't exiting nor a kernel thread.
Given this feature is meant for experiments, and it has a risk of
crashing the system, and running programs, we print a warning on
when a proglet that attempts to use this helper is installed,
along with the pid and process name.
Signed-off-by: Sargun Dhillon <sargun@sargun.me>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull perf updates from Ingo Molnar:
"With over 300 commits it's been a busy cycle - with most of the work
concentrated on the tooling side (as it should).
The main kernel side enhancements were:
- Add per event callchain limit: Recently we introduced a sysctl to
tune the max-stack for all events for which callchains were
requested:
$ sysctl kernel.perf_event_max_stack
kernel.perf_event_max_stack = 127
Now this patch introduces a way to configure this per event, i.e.
this becomes possible:
$ perf record -e sched:*/max-stack=2/ -e block:*/max-stack=10/ -a
allowing finer tuning of how much buffer space callchains use.
This uses an u16 from the reserved space at the end, leaving
another u16 for future use.
There has been interest in even finer tuning, namely to control the
max stack for kernel and userspace callchains separately. Further
discussion is needed, we may for instance use the remaining u16 for
that and when it is present, assume that the sample_max_stack
introduced in this patch applies for the kernel, and the u16 left
is used for limiting the userspace callchain (Arnaldo Carvalho de
Melo)
- Optimize AUX event (hardware assisted side-band event) delivery
(Kan Liang)
- Rework Intel family name macro usage (this is partially x86 arch
work) (Dave Hansen)
- Refine and fix Intel LBR support (David Carrillo-Cisneros)
- Add support for Intel 'TopDown' events (Andi Kleen)
- Intel uncore PMU driver fixes and enhancements (Kan Liang)
- ... other misc changes.
Here's an incomplete list of the tooling enhancements (but there's
much more, see the shortlog and the git log for details):
- Support cross unwinding, i.e. collecting '--call-graph dwarf'
perf.data files in one machine and then doing analysis in another
machine of a different hardware architecture. This enables, for
instance, to do:
$ perf record -a --call-graph dwarf
on a x86-32 or aarch64 system and then do 'perf report' on it on a
x86_64 workstation (He Kuang)
- Allow reading from a backward ring buffer (one setup via
sys_perf_event_open() with perf_event_attr.write_backward = 1)
(Wang Nan)
- Finish merging initial SDT (Statically Defined Traces) support, see
cset comments for details about how it all works (Masami Hiramatsu)
- Support attaching eBPF programs to tracepoints (Wang Nan)
- Add demangling of symbols in programs written in the Rust language
(David Tolnay)
- Add support for tracepoints in the python binding, including an
example, that sets up and parses sched:sched_switch events,
tools/perf/python/tracepoint.py (Jiri Olsa)
- Introduce --stdio-color to set up the color output mode selection
in 'annotate' and 'report', allowing emit color escape sequences
when redirecting the output of these tools (Arnaldo Carvalho de
Melo)
- Add 'callindent' option to 'perf script -F', to indent the Intel PT
call stack, making this output more ftrace-like (Adrian Hunter,
Andi Kleen)
- Allow dumping the object files generated by llvm when processing
eBPF scriptlet events (Wang Nan)
- Add stackcollapse.py script to help generating flame graphs (Paolo
Bonzini)
- Add --ldlat option to 'perf mem' to specify load latency for loads
event (e.g. cpu/mem-loads/ ) (Jiri Olsa)
- Tooling support for Intel TopDown counters, recently added to the
kernel (Andi Kleen)"
* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (303 commits)
perf tests: Add is_printable_array test
perf tools: Make is_printable_array global
perf script python: Fix string vs byte array resolving
perf probe: Warn unmatched function filter correctly
perf cpu_map: Add more helpers
perf stat: Balance opening and reading events
tools: Copy linux/{hash,poison}.h and check for drift
perf tools: Remove include/linux/list.h from perf's MANIFEST
tools: Copy the bitops files accessed from the kernel and check for drift
Remove: kernel unistd*h files from perf's MANIFEST, not used
perf tools: Remove tools/perf/util/include/linux/const.h
perf tools: Remove tools/perf/util/include/asm/byteorder.h
perf tools: Add missing linux/compiler.h include to perf-sys.h
perf jit: Remove some no-op error handling
perf jit: Add missing curly braces
objtool: Initialize variable to silence old compiler
objtool: Add -I$(srctree)/tools/arch/$(ARCH)/include/uapi
perf record: Add --tail-synthesize option
perf session: Don't warn about out of order event if write_backward is used
perf tools: Enable overwrite settings
...
IEEE 802.1AE-2006 standard recommends that the ICV element in a MACsec
frame should not exceed 16 octets: add MACSEC_STD_ICV_LEN in uapi
definitions accordingly, and avoid accepting configurations where the ICV
length exceeds the standard value. Leave definition of MACSEC_MAX_ICV_LEN
unchanged for backwards compatibility with userspace programs.
Fixes: dece8d2b78 ("uapi: add MACsec bits")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Following the work that have been done on offloading classifiers like u32
and flower, now the match-all classifier hw offloading is possible. if
the interface supports tc offloading.
To control the offloading, two tc flags have been introduced: skip_sw and
skip_hw. Typical usage:
tc filter add dev eth25 parent ffff: \
matchall skip_sw \
action mirred egress mirror \
dev eth27
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The matchall classifier matches every packet and allows the user to apply
actions on it. This filter is very useful in usecases where every packet
should be matched, for example, packet mirroring (SPAN) can be setup very
easily using that filter.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Yotam Gigi <yotamg@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Here is the big Staging and IIO driver update for 4.8-rc1.
We ended up adding more code than removing, again, but it's not all that
bad. Lots of cleanups all over the staging tree, and new IIO drivers,
full details in the shortlog.
All of these have been in linux-next for a while with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iFYEABECABYFAleVPQQPHGdyZWdAa3JvYWguY29tAAoJEDFH1A3bLfsplRgAniG6
jfPnvlHhl70T5HsGJzrc7VS9AKCBQ5x0gzTNxo2nnGfPmR8CVEH7Bg==
=0/6X
-----END PGP SIGNATURE-----
Merge tag 'staging-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging and IIO driver updates from Greg KH:
"Here is the big Staging and IIO driver update for 4.8-rc1.
We ended up adding more code than removing, again, but it's not all
that bad. Lots of cleanups all over the staging tree, and new IIO
drivers, full details in the shortlog.
All of these have been in linux-next for a while with no reported
issues"
* tag 'staging-4.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (417 commits)
drivers:iio:accel:mma8452: removed unwanted return statements
drivers:iio:accel:mma8452: added cleanup provision in case of failure.
iio: Add iio.git tree to MAINTAINERS
iio:st_pressure: clean useless static channel initializers
iio:st_pressure:lps22hb: temperature support
iio:st_pressure:lps22hb: open drain support
iio:st_pressure: temperature triggered buffering
iio:st_pressure: document sampling gains
iio:st_pressure: align storagebits on power of 2
iio:st_sensors: align on storagebits boundaries
staging:iio:lis3l02dq drop separate driver
iio: accel: st_accel: Add lis3l02dq support
iio: adc: add missing of_node references to iio_dev
iio: adc: ti-ads1015: add indio_dev->dev.of_node reference
iio: potentiometer: Fix typo in Kconfig
iio: potentiometer: mcp4531: Add device tree binding
iio: potentiometer: mcp4531: Add device tree binding documentation
iio: potentiometer: mcp4531: Add support for MCP454x, MCP456x, MCP464x and MCP466x
iio:imu:mpu6050: icm20608 initial support
iio: adc: max1363: Add device tree binding
...
* patchwork: (1492 commits)
[media] cec: always check all_device_types and features
[media] cec: poll should check if there is room in the tx queue
[media] vivid: support monitor all mode
[media] cec: fix test for unconfigured adapter in main message loop
[media] cec: limit the size of the transmit queue
[media] cec: zero unused msg part after msg->len
[media] cec: don't set fh to NULL in CEC_TRANSMIT
[media] cec: clear all status fields before transmit and always fill in sequence
[media] cec: CEC_RECEIVE overwrote the timeout field
[media] cxd2841er: Reading SNR for DVB-C added
[media] cxd2841er: Reading BER and UCB for DVB-C added
[media] cxd2841er: fix switch-case for DVB-C
[media] cxd2841er: fix signal strength scale for ISDB-T
[media] cxd2841er: adjust the dB scale for DVB-C
[media] cxd2841er: provide signal strength for DVB-C
[media] cxd2841er: fix BER report via DVBv5 stats API
[media] mb86a20s: apply mask to val after checking for read failure
[media] airspy: fix error logic during device register
[media] s5p-cec/TODO: add TODO item
[media] cec/TODO: drop comment about sphinx documentation
...
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
- GICv3 ITS emulation
- Simpler idmap management that fixes potential TLB conflicts
- Honor the kernel protection in HYP mode
- Removal of the old vgic implementation
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXkk6wAAoJECPQ0LrRPXpDkIQP/iJ2yXTxrfbJoyaVq1vuMn3R
UFhVwNXP8OEjQrmp5lvMBazB1MRBkNDzlVXL1fSb+ijKmbIELOqHhO6ijrkK4zmc
0Ie0x5Bt4gIFPTZyZORVpy1eU/0YFGWERAfsAjYdMCeKwHjaUCRSrZBXF2YsFTfo
Hh/ILvHa8TjUXWsQXvtZCL6AAnkDKBsbDWqsq5zspuT+PA8umI+dGLIiULXBpc4t
S2TCDxOU1JgsAn+Y0XVbPXV9id+bs5LRd6nNH/RmipIVqWmukSrScXOjg/po/l2S
laO4tHmyEeN6ecnCxWttpjacNwyTDNh5n3lL1ceBnBZFqn1k/7NjqV3fQzJxGd1T
1U6edE9+EuS9uXWF5XcEuAD660EiMs4FLVSjPgqYQtto3gOHilmuWL9eeeOOgCem
Lknnu/7G8h36PaQuLnEXWXQb7jeS2rTuC0RqxCG62gD9UWEJTckRz5pRh/e6gz7n
ZVXMrwGiVZ3zR78qE6i2j5CZ6A0BMAK3nZ85AI3kmgKg0CfVY28uPOj8llAOaYm+
0XVdfRj7ed75eu3GobjHUyZ0fQ40jovmH2vy3mupBm5XBUHgH/j6X510KJ1UTLWI
C2EO9KogbjoVeu60mQi4bKGSPi8/wdgYqVft/Qzl5D5iFvQ7Ia+TQNMArCQazBID
Ihe1E09NGrHjV3Yw/GWV
=2Del
-----END PGP SIGNATURE-----
Merge tag 'kvm-arm-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into next
KVM/ARM changes for Linux 4.8
- GICv3 ITS emulation
- Simpler idmap management that fixes potential TLB conflicts
- Honor the kernel protection in HYP mode
- Removal of the old vgic implementation
On ARM, the MSI msg (address and data) comes along with
out-of-band device ID information. The device ID encodes the
device that writes the MSI msg. Let's convey the device id in
kvm_irq_routing_msi and use KVM_MSI_VALID_DEVID flag value in
kvm_irq_routing_entry to indicate the msi devid is populated.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Andre Przywara <andre.przywara@arm.com>
Acked-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Change mapped device to implement direct_access function,
dm_blk_direct_access(), which calls a target direct_access function.
'struct target_type' is extended to have target direct_access interface.
This function limits direct accessible size to the dm_target's limit
with max_io_len().
Add dm_table_supports_dax() to iterate all targets and associated block
devices to check for DAX support. To add DAX support to a DM target the
target must only implement the direct_access function.
Add a new dm type, DM_TYPE_DAX_BIO_BASED, which indicates that mapped
device supports DAX and is bio based. This new type is used to assure
that all target devices have DAX support and remain that way after
QUEUE_FLAG_DAX is set in mapped device.
At initial table load, QUEUE_FLAG_DAX is set to mapped device when setting
DM_TYPE_DAX_BIO_BASED to the type. Any subsequent table load to the
mapped device must have the same type, or else it fails per the check in
table_load().
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
Add the official BPF ELF e_machine value that was assigned recently [1,2]
and will be propagated to glibc, et al. LLVM is switching to it in 3.9
release.
[1] 36b9c09330
[2] http://lists.iovisor.org/pipermail/iovisor-dev/2016-June/000266.html
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
XDP enabled drivers must transmit received packets back out on the same
port they were received on when a program returns this action.
Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Sets the bpf program represented by fd as an early filter in the rx path
of the netdev. The fd must have been created as BPF_PROG_TYPE_XDP.
Providing a negative value as fd clears the program. Getting the fd back
via rtnl is not possible, therefore reading of this value merely
provides a bool whether the program is valid on the link or not.
Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a new bpf prog type that is intended to run in early stages of the
packet rx path. Only minimal packet metadata will be available, hence a
new context type, struct xdp_md, is exposed to userspace. So far only
expose the packet start and end pointers, and only in read mode.
An XDP program must return one of the well known enum values, all other
return codes are reserved for future use. Unfortunately, this
restriction is hard to enforce at verification time, so take the
approach of warning at runtime when such programs are encountered. Out
of bounds return codes should alias to XDP_ABORTED.
Signed-off-by: Brenden Blanco <bblanco@plumgrid.com>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The NCSI command packets are sent from MC (Management Controller)
to remote end. They are used for multiple purposes: probe existing
NCSI package/channel, retrieve NCSI channel's capability, configure
NCSI channel etc.
This defines struct to represent NCSI command packets and introduces
function ncsi_xmit_cmd(), which will be used to transmit NCSI command
packet according to the request. The request is represented by struct
ncsi_cmd_arg.
Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Introduce a new KVM device that represents an ARM Interrupt Translation
Service (ITS) controller. Since there can be multiple of this per guest,
we can't piggy back on the existing GICv3 distributor device, but create
a new type of KVM device.
On the KVM_CREATE_DEVICE ioctl we allocate and initialize the ITS data
structure and store the pointer in the kvm_device data.
Upon an explicit init ioctl from userland (after having setup the MMIO
address) we register the handlers with the kvm_io_bus framework.
Any reference to an ITS thus has to go via this interface.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
The ARM GICv3 ITS MSI controller requires a device ID to be able to
assign the proper interrupt vector. On real hardware, this ID is
sampled from the bus. To be able to emulate an ITS controller, extend
the KVM MSI interface to let userspace provide such a device ID. For
PCI devices, the device ID is simply the 16-bit bus-device-function
triplet, which should be easily available to the userland tool.
Also there is a new KVM capability which advertises whether the
current VM requires a device ID to be set along with the MSI data.
This flag is still reported as not available everywhere, later we will
enable it when ITS emulation is used.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
We will use illegal instruction 0x0000 for handling 2 byte sw breakpoints
from user space. As it can be enabled dynamically via a capability,
let's move setting of ICTL_OPEREXC to the post creation step, so we avoid
any races when enabling that capability just while adding new cpus.
Acked-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Pull input fixes from Dmitry Torokhov:
"A few last-minute updates for the input subsystem"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: ts4800-ts - add missing of_node_put after calling of_parse_phandle
Input: synaptics-rmi4 - use of_get_child_by_name() to fix refcount
Revert "Input: wacom_w8001 - drop use of ABS_MT_TOOL_TYPE"
Input: xpad - validate USB endpoint count during probe
Input: add SW_PEN_INSERTED define
This work addresses a couple of issues bpf_skb_event_output()
helper currently has: i) We need two copies instead of just a
single one for the skb data when it should be part of a sample.
The data can be non-linear and thus needs to be extracted via
bpf_skb_load_bytes() helper first, and then copied once again
into the ring buffer slot. ii) Since bpf_skb_load_bytes()
currently needs to be used first, the helper needs to see a
constant size on the passed stack buffer to make sure BPF
verifier can do sanity checks on it during verification time.
Thus, just passing skb->len (or any other non-constant value)
wouldn't work, but changing bpf_skb_load_bytes() is also not
the proper solution, since the two copies are generally still
needed. iii) bpf_skb_load_bytes() is just for rather small
buffers like headers, since they need to sit on the limited
BPF stack anyway. Instead of working around in bpf_skb_load_bytes(),
this work improves the bpf_skb_event_output() helper to address
all 3 at once.
We can make use of the passed in skb context that we have in
the helper anyway, and use some of the reserved flag bits as
a length argument. The helper will use the new __output_custom()
facility from perf side with bpf_skb_copy() as callback helper
to walk and extract the data. It will pass the data for setup
to bpf_event_output(), which generates and pushes the raw record
with an additional frag part. The linear data used in the first
frag of the record serves as programmatically defined meta data
passed along with the appended sample.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This header contains the userspace API for lirc.
This is a fixup for commit b7be755733 ("[media] bz#75751: Move
internal header file lirc.h to uapi/"). It moved the header to the
right place, but it forgot to add it at Kbuild. So, despite being at
uapi, it is not copied to the right place.
Fixes: b7be755733 ("[media] bz#75751: Move internal header file lirc.h to uapi/")
Link: http://lkml.kernel.org/r/320c765d32bfc82c582e336d52ffe1026c73c644.1468439021.git.mchehab@s-opensource.com
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Cc: Alec Leamas <leamas.alec@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Add KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK as a feature flag to
KVM_CAP_X2APIC_API.
The quirk made KVM interpret 0xff as a broadcast even in x2APIC mode.
The enableable capability is needed in order to support standard x2APIC and
remain backward compatible.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
[Expand kvm_apic_mda comment. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
KVM_CAP_X2APIC_API is a capability for features related to x2APIC
enablement. KVM_X2APIC_API_32BIT_FORMAT feature can be enabled to
extend APIC ID in get/set ioctl and MSI addresses to 32 bits.
Both are needed to support x2APIC.
The feature has to be enableable and disabled by default, because
get/set ioctl shifted and truncated APIC ID to 8 bits by using a
non-standard protocol inspired by xAPIC and the change is not
backward-compatible.
Changes to MSI addresses follow the format used by interrupt remapping
unit. The upper address word, that used to be 0, contains upper 24 bits
of the LAPIC address in its upper 24 bits. Lower 8 bits are reserved as
0. Using the upper address word is not backward-compatible either as we
didn't check that userspace zeroed the word. Reserved bits are still
not explicitly checked, but non-zero data will affect LAPIC addresses,
which will cause a bug.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* topic/vsp1: (36 commits)
[media] v4l: vsp1: wpf: Add flipping support
[media] v4l: vsp1: rwpf: Support runtime modification of controls
[media] v4l: vsp1: Simplify alpha propagation
[media] v4l: vsp1: clu: Support runtime modification of controls
[media] v4l: vsp1: lut: Support runtime modification of controls
[media] v4l: vsp1: Support runtime modification of controls
[media] v4l: vsp1: Add Cubic Look Up Table (CLU) support
[media] v4l: vsp1: lut: Expose configuration through a control
[media] v4l: vsp1: lut: Initialize the mutex
[media] v4l: vsp1: dl: Don't free fragments with interrupts disabled
[media] v4l: vsp1: Set entities functions
[media] v4l: vsp1: Don't create LIF entity when the userspace API is enabled
[media] v4l: vsp1: Don't register media device when userspace API is disabled
[media] v4l: vsp1: Base link creation on availability of entities
[media] media: Add video statistics computation functions
[media] media: Add video processing entity functions
[media] v4l: vsp1: sru: Fix intensity control ID
[media] v4l: vsp1: Stop the pipeline upon the first STREAMOFF
[media] v4l: vsp1: Constify operation structures
[media] v4l: vsp1: pipe: Fix typo in comment
...
This is for the new pulse8-cec staging driver.
Signed-off-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
This patch adds SCTP_PR_ASSOC_STATUS to sctp sockopt, which is used
to dump the prsctp statistics info from the asoc. The prsctp statistics
includes abandoned_sent/unsent from the asoc. abandoned_sent is the
count of the packets we drop packets from retransmit/transmited queue,
and abandoned_unsent is the count of the packets we drop from out_queue
according to the policy.
Note: another option for prsctp statistics dump described in rfc is
SCTP_PR_STREAM_STATUS, which is used to dump the prsctp statistics
info from each stream. But by now, linux doesn't yet have per stream
statistics info, it needs rfc6525 to be implemented. As the prsctp
statistics for each stream has to be based on per stream statistics,
we will delay it until rfc6525 is done in linux.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds SCTP_DEFAULT_PRINFO to sctp sockopt. It is used
to set/get sctp Partially Reliable Policies' default params,
which includes 3 policies (ttl, rtx, prio) and their values.
Still, if we set policy params in sndinfo, we will use the params
of sndinfo against chunks, instead of the default params.
In this patch, we will use 5-8bit of sp/asoc->default_flags
to store prsctp policies, and reuse asoc->default_timetolive
to store their values. It means if we enable and set prsctp
policy, prior ttl timeout in sctp will not work any more.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
According to section 4.5 of rfc7496, prsctp_enable should be per asoc.
We will add prsctp_enable to both asoc and ep, and replace the places
where it used net.sctp->prsctp_enable with asoc->prsctp_enable.
ep->prsctp_enable will be initialized with net.sctp->prsctp_enable, and
asoc->prsctp_enable will be initialized with ep->prsctp_enable. We can
also modify it's value through sockopt SCTP_PR_SUPPORTED.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
While reviewing the documentation gaps on LIRC, it was
noticed that several ioctls aren't used by any LIRC drivers
(nor at staging or mainstream).
It doesn't make sense to document them, as they're not used
anywhere. So, let's remove those from the lirc header.
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
As was suggested this patch adds support for the different versions of MLD
and IGMP query types. Since the user visible structure is still in net-next
we can augment it instead of adding netlink attributes.
The distinction between the different IGMP/MLD query types is done as
suggested in Section 7.1, RFC 3376 [1] and Section 8.1, RFC 3810 [2] based
on query payload size and code for IGMP. Since all IGMP packets go through
multicast_rcv() and it uses ip_mc_check_igmp/ipv6_mc_check_mld we can be
sure that at least the ip/ipv6 header can be directly used.
[1] https://tools.ietf.org/html/rfc3376#section-7
[2] https://tools.ietf.org/html/rfc3810#section-8.1
Suggested-by: Linus Lüssing <linus.luessing@c0d3.blue>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
over time there were multiple requests to access different data
structures and fields of task_struct current, so finally add
the helper to access 'current' as-is. Tracing bpf programs will do
the rest of walking the pointers via bpf_probe_read().
Note that current can be null and bpf program has to deal it with,
but even dumb passing null into bpf_probe_read() is still safe.
Suggested-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJXefulAAoJEHm+PkMAQRiG6nMH/2O1vcZeOtqmx2yCMUeXyKAT
wG88XflXzf3rM7C7TiObEYVf/bbLleJ7saDLEeic7ButD5gyYacIuzylVnrcqfBc
vinz4cOw5kvu9DrRkCKdOfiTAgwYtqQW+syJ8ZK4lPQuSxnPAs+F/FKSOpyUF5FN
Dngr520KjYKBEtn27W9UDPChFRwQoWAlaOC534eusaArCJtHGHHiuq5TEDn2EIo8
pUw2vwx5JiquSHOY34WLU7r+QoilovCQlUSsBQdLlPjfMB1QFtclPYa+5yEMjkT4
wusOUOfS/zK0rV6KnEdc/SkpiVX5C9WpFiWUOdEeJ5mZ+KijVkaOa9K1EDx8jSM=
=7Hwh
-----END PGP SIGNATURE-----
Merge tag 'v4.7-rc6' into patchwork
Linux 4.7-rc6
* tag 'v4.7-rc6': (1245 commits)
Linux 4.7-rc6
ovl: warn instead of error if d_type is not supported
MIPS: Fix possible corruption of cache mode by mprotect.
locks: use file_inode()
usb: dwc3: st: Use explicit reset_control_get_exclusive() API
phy: phy-stih407-usb: Use explicit reset_control_get_exclusive() API
phy: miphy28lp: Inform the reset framework that our reset line may be shared
namespace: update event counter when umounting a deleted dentry
9p: use file_dentry()
lockd: unregister notifier blocks if the service fails to come up completely
ACPI,PCI,IRQ: correct operator precedence
fuse: serialize dirops by default
drm/i915: Fix missing unlock on error in i915_ppgtt_info()
powerpc: Initialise pci_io_base as early as possible
mfd: da9053: Fix compiler warning message for uninitialised variable
mfd: max77620: Fix FPS switch statements
phy: phy-stih407-usb: Inform the reset framework that our reset line may be shared
usb: dwc3: st: Inform the reset framework that our reset line may be shared
usb: host: ehci-st: Inform the reset framework that our reset line may be shared
usb: host: ohci-st: Inform the reset framework that our reset line may be shared
...
* beacon report (for radio measurement) support in cfg80211/mac80211
* hwsim: allow wmediumd in namespaces
* mac80211: extend 160MHz workaround to CSA IEs
* mesh: properly encrypt group-addressed privacy action frames
* mesh: allow setting peer AID
* first steps for MU-MIMO monitor mode
* along with various other cleanups and improvements
-----BEGIN PGP SIGNATURE-----
iQIcBAABCgAGBQJXfSCuAAoJEGt7eEactAAdaugP/ilrcELQRsIN5ZXCAKYZuwXV
T01JPwgOaWL9ILu7h1SfG/+j9kzMnyk4WmRpeoj2FGNcyfG2AvULWSLpQJ2abwgQ
8o/emuLinQwRENevaMUTRSOE0HkXoFPCbbq37+a2i6bAv1QSYY3A0xvWpcU5fZ4D
7CKYDYPBAdMXYwEwy1g4nYWfDAYqS4rthr3l3rS1Cy7Q2T1ZlMlD90GjD7oeQAEw
orKulhkkDSzvxfvZCYTzXmUoBQE8sNXGDD+OFsJyowyt+ugM/xan+2tmhCaHSnda
HpdCS2aRj779UBn9cOfELjffTNpS++PM6KFd8ZDaPcJSMginn/BAHTOeNfNUfL0Q
+Enu59I82qMDzbG2z1Qezzjv7OTzyEvyvYzNbLOqljTBSBklDa3rHwhyk+g1oVBH
+4xX1Vk5QBLde+Q0NS0gTkcqOQK8KT5+HEqiUfgLSNDETN0lSGsKbtvMfU/ikE1Y
aLRkTp7nzUd03qjIFLS6RMf7JjucWWzH1ZXTHvbpDFAG7riOhYRD3Sw+0e7madTd
+HXjH9dnOGnGPDL+FyDwtW6iclYwNjcIPQiNdOjwWfMA2Wmr7iq+aFUptRCwQTHB
WtgJ3f8OHax2JXcm4grYfxELZip5vbWJJHUC84Drvmzw3X7FRITf+OEWjdNOzsRD
Fc7w5ceThh9Id1BvcH2+
=R1qP
-----END PGP SIGNATURE-----
Merge tag 'mac80211-next-for-davem-2016-07-06' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
Johannes Berg says:
====================
One more set of new features:
* beacon report (for radio measurement) support in cfg80211/mac80211
* hwsim: allow wmediumd in namespaces
* mac80211: extend 160MHz workaround to CSA IEs
* mesh: properly encrypt group-addressed privacy action frames
* mesh: allow setting peer AID
* first steps for MU-MIMO monitor mode
* along with various other cleanups and improvements
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Conflicts:
drivers/net/ethernet/mellanox/mlx5/core/en.h
drivers/net/ethernet/mellanox/mlx5/core/en_main.c
drivers/net/usb/r8152.c
All three conflicts were overlapping changes.
Signed-off-by: David S. Miller <davem@davemloft.net>
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
The following patchset contains Netfilter updates for net-next,
they are:
1) Don't use userspace datatypes in bridge netfilter code, from
Tobin Harding.
2) Iterate only once over the expectation table when removing the
helper module, instead of once per-netns, from Florian Westphal.
3) Extra sanitization in xt_hook_ops_alloc() to return error in case
we ever pass zero hooks, xt_hook_ops_alloc():
4) Handle NFPROTO_INET from the logging core infrastructure, from
Liping Zhang.
5) Autoload loggers when TRACE target is used from rules, this doesn't
change the behaviour in case the user already selected nfnetlink_log
as preferred way to print tracing logs, also from Liping Zhang.
6) Conntrack slabs with SLAB_HWCACHE_ALIGN to allow rearranging fields
by cache lines, increases the size of entries in 11% per entry.
From Florian Westphal.
7) Skip zone comparison if CONFIG_NF_CONNTRACK_ZONES=n, from Florian.
8) Remove useless defensive check in nf_logger_find_get() from Shivani
Bhardwaj.
9) Remove zone extension as place it in the conntrack object, this is
always include in the hashing and we expect more intensive use of
zones since containers are in place. Also from Florian Westphal.
10) Owner match now works from any namespace, from Eric Bierdeman.
11) Make sure we only reply with TCP reset to TCP traffic from
nf_reject_ipv4, patch from Liping Zhang.
12) Introduce --nflog-size to indicate amount of network packet bytes
that are copied to userspace via log message, from Vishwanath Pai.
This obsoletes --nflog-range that has never worked, it was designed
to achieve this but it has never worked.
13) Introduce generic macros for nf_tables object generation masks.
14) Use generation mask in table, chain and set objects in nf_tables.
This allows fixes interferences with ongoing preparation phase of
the commit protocol and object listings going on at the same time.
This update is introduced in three patches, one per object.
15) Check if the object is active in the next generation for element
deactivation in the rbtree implementation, given that deactivation
happens from the commit phase path we have to observe the future
status of the object.
16) Support for deletion of just added elements in the hash set type.
17) Allow to resize hashtable from /proc entry, not only from the
obscure /sys entry that maps to the module parameter, from Florian
Westphal.
18) Get rid of NFT_BASECHAIN_DISABLED, this code is not exercised
anymore since we tear down the ruleset whenever the netdevice
goes away.
19) Support for matching inverted set lookups, from Arturo Borrero.
20) Simplify the iptables_mangle_hook() by removing a superfluous
extra branch.
21) Introduce ether_addr_equal_masked() and use it from the netfilter
codebase, from Joe Perches.
22) Remove references to "Use netfilter MARK value as routing key"
from the Netfilter Kconfig description given that this toggle
doesn't exists already for 10 years, from Moritz Sichert.
23) Introduce generic NF_INVF() and use it from the xtables codebase,
from Joe Perches.
24) Setting logger to NONE via /proc was not working unless explicit
nul-termination was included in the string. This fixes seems to
leave the former behaviour there, so we don't break backward.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Previously, mesh power management functionality works only with kernel
MPM. Because user space MPM did not report mesh peer AID to kernel,
the kernel could not identify the bit in TIM element. So this patch
adds mesh peer AID setting API.
Signed-off-by: Masashi Honma <masashi.honma@gmail.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Beacon report radio measurement requires reporting observed BSSs
on the channels specified in the beacon request. If the measurement
mode is set to passive or active, it requires actually performing a
scan (passive or active, accordingly), and reporting the time that
the scan was started and the time each beacon/probe was received
(both in terms of TSF of the BSS of the requesting AP). If the
request mode is table, this information is optional.
In addition, the radio measurement request specifies the channel
dwell time for the measurement.
In order to use scan for beacon report when the mode is active or
passive, add a parameter to scan request that specifies the
channel dwell time, and add scan start time and beacon received time
to scan results information.
Supporting beacon report is required for Multi Band Operation (MBO).
Signed-off-by: Assaf Krauss <assaf.krauss@intel.com>
Signed-off-by: David Spinadel <david.spinadel@intel.com>
Signed-off-by: Avraham Stern <avraham.stern@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
add API to support VHT MU-MIMO air sniffer.
in MU-MIMO there are parallel frames on the air while the HW
has only one RX.
add the capability to sniff one of the MU-MIMO parallel frames by
giving the sniffer additional information so it'll know which
of the parallel frames it shall follow.
Add attribute - NL80211_ATTR_MU_MIMO_GROUP_DATA - for getting
a MU-MIMO groupID in order to monitor packets from that group
using VHT MU-MIMO.
And add attribute -NL80211_ATTR_MU_MIMO_FOLLOW_ADDR - for passing
MAC address to monitor mode.
that option will be used by VHT MU-MIMO air sniffer to follow a
station according to it's MAC address using VHT MU-MIMO.
Signed-off-by: Aviya Erenfeld <aviya.erenfeld@intel.com>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
- Cleanup work by Markus Pargmann and Sven Eckelmann (six patches)
- Initial Netlink support by Matthias Schiffer (two patches)
- Throughput Meter implementation by Antonio Quartulli, a kernel-space
traffic generator to estimate link speeds. This feature is useful on
low-end WiFi APs where running iperf or netperf from userspace
gives wrong results due to heavy userspace/kernelspace overhead.
(two patches)
- API clean-up work by Antonio Quartulli (one patch)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCgAGBQJXel3cAAoJEKEr45hCkp6hxs0P/jQZBJ37Bd4EHRGdhvCJWwsO
j79zr7mIECub8a6PMkO1GI87ksJNtRdRw7XAIbLKTwsKEsUE0Gpv/MLLKgv/nD7X
zatcoI4DujkgSojZIcOV/061+M9FAnCtAYv13jIS8nbXdqfGPxPfLua6Zbvx1GS2
z/Rqg/WbB2qDtDlUrp0W/8oXQ+k6062G7GigroPLmjdWd5lF0H6ly4loWsxFyr0U
GVl44HM4nOj7DwkVlrGoOXnAbjpz9TNC/aA5TIS/tLFZkm5dvJjjKLDbxo5NM9aq
hRhFy8Gbe0TmOxV3mKZUT1oHuaHgFDY2tADLiLF2g/ijgaTetXCBJ6DXQ7BkiZnh
+t1fnutOB1D05+cZGDmlfb2bFXO6vdDwNzKYuPdeW0tUOVwzNIaMK+US1HffUD3F
ciK/cALsLbfJ3QkUHJclE57baMuB2c7YWJUxGdp2r4lKHak6tc8+BsornI6lB6qY
kcxip6EEaT7edjT66Qjq8GtFK7WIri5nHI9n5Js+Wwl1QENvkLmZRQ6qZexwSplS
RTZmmO+i+Y4rGDa3xoVSlC+CEUO7D4VwhET2Jir7KJrVS+pFNRAmfpUNWxeauAls
D1xWgBrWjjOYu5i3LjwC6cHl4eTWxBwWmBUaxLUUeyoR22ulIs6bXCQMWOLMbupd
q8k2B5BW9waTAgb4Tam9
=PFHu
-----END PGP SIGNATURE-----
Merge tag 'batadv-next-for-davem-20160704' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
This feature patchset includes the following changes:
- Cleanup work by Markus Pargmann and Sven Eckelmann (six patches)
- Initial Netlink support by Matthias Schiffer (two patches)
- Throughput Meter implementation by Antonio Quartulli, a kernel-space
traffic generator to estimate link speeds. This feature is useful on
low-end WiFi APs where running iperf or netperf from userspace
gives wrong results due to heavy userspace/kernelspace overhead.
(two patches)
- API clean-up work by Antonio Quartulli (one patch)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
If skb_clear_hash() was invoked due to mangling of relevant headers and
BPF program needs skb->hash later on, we can add a helper to trigger hash
recalculation via bpf_get_hash_recalc().
The helper will return the newly retrieved hash directly, but later access
can also be done via skb context again through skb->hash directly (inline)
without needing to call the helper once more.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Extremely useful for setting packet type to host so i dont
have to modify the dst mac address using pedit (which requires
that i know the mac address)
Example usage:
tc filter add dev eth0 parent ffff: protocol ip pref 9 u32 \
match ip src 5.5.5.5/32 \
flowid 1:5 action skbedit ptype host
This will tag all packets incoming from 5.5.5.5 with type
PACKET_HOST
Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The throughput meter module is a simple, kernel-space replacement for
throughtput measurements tool like iperf and netperf. It is intended to
approximate TCP behaviour.
It is invoked through batctl: the protocol is connection oriented, with
cumulative acknowledgment and a dynamic-size sliding window.
The test *can* be interrupted by batctl. A receiver side timeout avoids
unlimited waitings for sender packets: after one second of inactivity, the
receiver abort the ongoing test.
Based on a prototype from Edo Monticelli <montik@autistici.org>
Signed-off-by: Antonio Quartulli <antonio.quartulli@open-mesh.com>
Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
BATADV_CMD_GET_MESH_INFO is used to query basic information about a
batman-adv softif (name, index and MAC address for both the softif and
the primary hardif; routing algorithm; batman-adv version).
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
[sven.eckelmann@open-mesh.com: Reduce the number of changes to
BATADV_CMD_GET_MESH_INFO, add missing kerneldoc, add policy for attributes]
Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
debugfs is currently severely broken virtually everywhere in the kernel
where files are dynamically added and removed (see
http://lkml.iu.edu/hypermail/linux/kernel/1506.1/02196.html for some
details). In addition to that, debugfs is not namespace-aware.
Instead of adding new debugfs entries, the whole infrastructure should be
moved to netlink. This will fix the long standing problem of large buffers
for debug tables and hard to parse text files.
Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
[sven.eckelmann@open-mesh.com: Strip down patch to only add genl family,
add missing kerneldoc]
Signed-off-by: Sven Eckelmann <sven.eckelmann@open-mesh.com>
Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Pull fuse fix from Miklos Szeredi:
"This makes sure userspace filesystems are not broken by the parallel
lookups and readdir feature"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
fuse: serialize dirops by default
Add the commands to set and show the mode of SRIOV E-Switch, two modes
are supported:
* legacy: operating in the "old" L2 based mode (DMAC --> VF vport)
* switchdev: the E-Switch is referred to as whitebox switch configured
using standard tools such as tc, bridge, openvswitch etc. To allow
working with the tools, for each VF, a VF representor netdevice is
created by the E-Switch manager vendor device driver instance (e.g PF).
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Adds a bpf helper, bpf_skb_in_cgroup, to decide if a skb->sk
belongs to a descendant of a cgroup2. It is similar to the
feature added in netfilter:
commit c38c4597e4 ("netfilter: implement xt_cgroup cgroup2 path match")
The user is expected to populate a BPF_MAP_TYPE_CGROUP_ARRAY
which will be used by the bpf_skb_in_cgroup.
Modifications to the bpf verifier is to ensure BPF_MAP_TYPE_CGROUP_ARRAY
and bpf_skb_in_cgroup() are always used together.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add a BPF_MAP_TYPE_CGROUP_ARRAY and its bpf_map_ops's implementations.
To update an element, the caller is expected to obtain a cgroup2 backed
fd by open(cgroup2_dir) and then update the array with that fd.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Tejun Heo <tj@kernel.org>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
We found that sometimes a restored tcp socket doesn't work.
A reason of this bug is incorrect window parameters and in this case
tcp_acceptable_seq() returns tcp_wnd_end(tp) instead of tp->snd_nxt. The
other side drops packets with this seq, because seq is less than
tp->rcv_nxt ( tcp_sequence() ).
Data from a send queue is sent only if there is enough space in a
window, so when we restore unacked data, we need to expand a window to
fit this data.
This was in a first version of this patch:
"tcp: extend window to fit all restored unacked data in a send queue"
Then Alexey recommended me to restore window parameters instead of
adjusted them according with data in a sent queue. This sounds resonable.
rcv_wnd has to be restored, because it was reported to another side
and the offered window is never shrunk.
One of reasons why we need to restore snd_wnd was described above.
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Negotiate with userspace filesystems whether they support parallel readdir
and lookup. Disable parallelism by default for fear of breaking fuse
filesystems.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Fixes: 9902af79c0 ("parallel lookups: actual switch to rwsem")
Fixes: d9b3dbdcfd ("fuse: switch to ->iterate_shared()")
This patch adds stats support for the currently used IGMP/MLD types by the
bridge. The stats are per-port (plus one stat per-bridge) and per-direction
(RX/TX). The stats are exported via netlink via the new linkxstats API
(RTM_GETSTATS). In order to minimize the performance impact, a new option
is used to enable/disable the stats - multicast_stats_enabled, similar to
the recent vlan stats. Also in order to avoid multiple IGMP/MLD type
lookups and checks, we make use of the current "igmp" member of the bridge
private skb->cb region to record the type on Rx (both host-generated and
external packets pass by multicast_rcv()). We can do that since the igmp
member was used as a boolean and all the valid IGMP/MLD types are positive
values. The normal bridge fast-path is not affected at all, the only
affected paths are the flooding ones and since we make use of the IGMP/MLD
type, we can quickly determine if the packet should be counted using
cache-hot data (cb's igmp member). We add counters for:
* IGMP Queries
* IGMP Leaves
* IGMP v1/v2/v3 reports
* MLD Queries
* MLD Leaves
* MLD v1/v2 reports
These are invaluable when monitoring or debugging complex multicast setups
with bridges.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds support for the IFLA_STATS_LINK_XSTATS_SLAVE attribute
which allows to export per-slave statistics if the master device supports
the linkxstats callback. The attribute is passed down to the linkxstats
callback and it is up to the callback user to use it (an example has been
added to the only current user - the bridge). This allows us to query only
specific slaves of master devices like bridge ports and export only what
we're interested in instead of having to dump all ports and searching only
for a single one. This will be used to export per-port IGMP/MLD stats and
also per-port vlan stats in the future, possibly other statistics as well.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This work adds a helper for changing skb->pkt_type in a controlled way.
We only allow a subset of possible values and can extend that in future
should other use cases come up. Doing this as a helper has the advantage
that errors can be handeled gracefully and thus helper kept extensible.
It's a write counterpart to pkt_type member we can already read from
struct __sk_buff context. Major use case is to change incoming skbs to
PACKET_HOST in a programmatic way instead of having to recirculate via
redirect(..., BPF_F_INGRESS), for example.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch adds a minimal helper for doing the groundwork of changing
the skb->protocol in a controlled way. Currently supported is v4 to
v6 and vice versa transitions, which allows f.e. for a minimal, static
nat64 implementation where applications in containers that still
require IPv4 can be transparently operated in an IPv6-only environment.
For example, host facing veth of the container can transparently do
the transitions in a programmatic way with the help of clsact qdisc
and cls_bpf.
Idea is to separate concerns for keeping complexity of the helper
lower, which means that the programs utilize bpf_skb_change_proto(),
bpf_skb_store_bytes() and bpf_lX_csum_replace() to get the job done,
instead of doing everything in a single helper (and thus partially
duplicating helper functionality). Also, bpf_skb_change_proto()
shouldn't need to deal with raw packet data as this is done by other
helpers.
bpf_skb_proto_6_to_4() and bpf_skb_proto_4_to_6() unclone the skb to
operate on a private one, push or pop additionally required header
space and migrate the gso/gro meta data from the shared info. We do
mark the gso type as dodgy so that headers are checked and segs
recalculated by the gso/gro engine. The gso_size target is adapted
as well. The flags argument added is currently reserved and can be
used for future extensions.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Follow-up commit to 1e33759c78 ("bpf, trace: add BPF_F_CURRENT_CPU
flag for bpf_perf_event_output") to add the same functionality into
bpf_perf_event_read() helper. The split of index into flags and index
component is also safe here, since such large maps are rejected during
map allocation time.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Several cases of overlapping changes, except the packet scheduler
conflicts which deal with the addition of the free list parameter
to qdisc_enqueue().
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull networking fixes from David Miller:
"I've been traveling so this accumulates more than week or so of bug
fixing. It perhaps looks a little worse than it really is.
1) Fix deadlock in ath10k driver, from Ben Greear.
2) Increase scan timeout in iwlwifi, from Luca Coelho.
3) Unbreak STP by properly reinjecting STP packets back into the
stack. Regression fix from Ido Schimmel.
4) Mediatek driver fixes (missing malloc failure checks, leaking of
scratch memory, wrong indexing when mapping TX buffers, etc.) from
John Crispin.
5) Fix endianness bug in icmpv6_err() handler, from Hannes Frederic
Sowa.
6) Fix hashing of flows in UDP in the ruseport case, from Xuemin Su.
7) Fix netlink notifications in ovs for tunnels, delete link messages
are never emitted because of how the device registry state is
handled. From Nicolas Dichtel.
8) Conntrack module leaks kmemcache on unload, from Florian Westphal.
9) Prevent endless jump loops in nft rules, from Liping Zhang and
Pablo Neira Ayuso.
10) Not early enough spinlock initialization in mlx4, from Eric
Dumazet.
11) Bind refcount leak in act_ipt, from Cong WANG.
12) Missing RCU locking in HTB scheduler, from Florian Westphal.
13) Several small MACSEC bug fixes from Sabrina Dubroca (missing RCU
barrier, using heap for SG and IV, and erroneous use of async flag
when allocating AEAD conext.)
14) RCU handling fix in TIPC, from Ying Xue.
15) Pass correct protocol down into ipv4_{update_pmtu,redirect}() in
SIT driver, from Simon Horman.
16) Socket timer deadlock fix in TIPC from Jon Paul Maloy.
17) Fix potential deadlock in team enslave, from Ido Schimmel.
18) Memory leak in KCM procfs handling, from Jiri Slaby.
19) ESN generation fix in ipv4 ESP, from Herbert Xu.
20) Fix GFP_KERNEL allocations with locks held in act_ife, from Cong
WANG.
21) Use after free in netem, from Eric Dumazet.
22) Uninitialized last assert time in multicast router code, from Tom
Goff.
23) Skip raw sockets in sock_diag destruction broadcast, from Willem
de Bruijn.
24) Fix link status reporting in thunderx, from Sunil Goutham.
25) Limit resegmentation of retransmit queue so that we do not
retransmit too large GSO frames. From Eric Dumazet.
26) Delay bpf program release after grace period, from Daniel
Borkmann"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (141 commits)
openvswitch: fix conntrack netlink event delivery
qed: Protect the doorbell BAR with the write barriers.
neigh: Explicitly declare RCU-bh read side critical section in neigh_xmit()
e1000e: keep VLAN interfaces functional after rxvlan off
cfg80211: fix proto in ieee80211_data_to_8023 for frames without LLC header
qlcnic: use the correct ring in qlcnic_83xx_process_rcv_ring_diag()
bpf, perf: delay release of BPF prog after grace period
net: bridge: fix vlan stats continue counter
tcp: do not send too big packets at retransmit time
ibmvnic: fix to use list_for_each_safe() when delete items
net: thunderx: Fix TL4 configuration for secondary Qsets
net: thunderx: Fix link status reporting
net/mlx5e: Reorganize ethtool statistics
net/mlx5e: Fix number of PFC counters reported to ethtool
net/mlx5e: Prevent adding the same vxlan port
net/mlx5e: Check for BlueFlame capability before allocating SQ uar
net/mlx5e: Change enum to better reflect usage
net/mlx5: Add ConnectX-5 PCIe 4.0 to list of supported devices
net/mlx5: Update command strings
net: marvell: Add separate config ANEG function for Marvell 88E1111
...
Some devices with a pen may have a switch that can be used to detect
when the pen is inserted or removed to a slot on the device. Let's add
a define to the input event codes so that everyone can be on the same
page for what event we should generate when the pen is inserted or
removed.
In general the pen switch could be used by the software on the device to
kick off any number of actions when the pen is inserted or removed.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Replace the custom ioctl with a V4L2 control in order to standardize the
API.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
The video statistics function describes entities such as video histogram
engines or 3A statistics engines.
Signed-off-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com>
Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
The formats are interleaved with the YUV packed and miscellaneous
formats, making the result confusing especially with the YUV444 format
being packed and not planar like YUV410 or YUV420. Move them to their
own group as the 2 planes or 3 non-contiguous planes formats to clarify
the header.
Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Add support to inet_diag facility to filter sockets based on device
index. If an interface index is in the filter only sockets bound
to that index (sk_bound_dev_if) are returned.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Pull input fixes from Dmitry Torokhov.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: vmmouse - remove port reservation
Input: elantech - add more IC body types to the list
Input: wacom_w8001 - ignore invalid pen data packets
Input: wacom_w8001 - w8001_MAX_LENGTH should be 13
Input: xpad - fix oops when attaching an unknown Xbox One gamepad
MAINTAINERS: add Pali Rohár as reviewer of ALPS PS/2 touchpad driver
Input: add HDMI CEC specific keycodes
Input: add BUS_CEC type
Input: xpad - fix rumble on Xbox One controllers with 2015 firmware
CALIPSO is a hop-by-hop IPv6 option. A lot of this patch is based on
the equivalent CISPO code. The main difference is due to manipulating
the options in the hop-by-hop header.
Signed-off-by: Huw Davies <huw@codeweavers.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>